Tiny Titans: Efficient Large Vision, Language and Multimodal Models Through Pruning
Abstract
Notable progress in solving complex reasoning tasks relies on large models. Unfortunately, developing these models demands substantial computational resources and energy consumption. As a result, industry drives the most significant advances in state-of-the-art models, while the environmental impact of AI draws growing attention from the scientific community (GreenAI). Pruning emerges as an effective mechanism to address the capacity versus computational cost dilemma by eliminating structures (weights, neurons, or layers) from deep models. This tutorial introduces the theoretical and technical foundations of this promising, active, and exciting field. It delves into pruning techniques as a pillar of GreenAI and a foundation for the next wave of efficient large vision, language, and multimodal models. Our tutorial also covers how existing forms of pruning impact efficiency gains, guiding participants to make informed choices for their scenario and infrastructure. Specifically, we equip participants with the basics and key recipes to effectively apply pruning in practical computer vision scenarios. Additional material is available at: github.com/arturjordao/TinyTitans
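To make the idea of eliminating structures concrete, below is a minimal sketch of unstructured magnitude pruning, one of the simplest pruning criteria: weights whose absolute value falls below a threshold are zeroed out. This is an illustrative NumPy example, not the tutorial's specific method; the function name and sparsity level are assumptions for demonstration.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of weights.

    Illustrative sketch: real pruning pipelines typically also fine-tune
    the model afterwards to recover accuracy.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.partition(flat, k - 1)[k - 1]
    # Keep only weights strictly above the threshold.
    mask = np.abs(weights) > threshold
    return weights * mask

# Toy example: prune half of a random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned = magnitude_prune(w, sparsity=0.5)
print(f"achieved sparsity: {np.mean(pruned == 0):.2f}")
```

Structured variants follow the same principle but remove whole neurons (rows/columns) or layers instead of individual weights, which translates more directly into wall-clock speedups on standard hardware.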
