MARVA: Modular Architecture for Robust Visual Agents

Vanessa Schenkel; Gabriel de Oliveira Ramos

doi:10.5753/eramiars.2025.16789

Vanessa Schenkel, Unisinos
Gabriel de Oliveira Ramos, Unisinos

Abstract

Generalization in visual reinforcement learning (RL) is challenging: even small visual shifts can degrade performance. We present MARVA, a dual-regularization extension of MaDi that combines a discriminator trained through a gradient reversal layer (GRL) with a contrastive (InfoNCE) loss on masked views. On walker-walk, MARVA matches baseline performance in the easier domains and improves robustness in video_hard and DistractingCS.
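
To make the contrastive term concrete, the following is a minimal sketch of an InfoNCE loss over a batch of paired embeddings, written in plain Python. It is not the authors' implementation: the function name `info_nce`, the cosine-similarity scoring, and the temperature value are assumptions chosen for illustration, and the abstract does not specify how the masked views are encoded.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs (illustrative sketch).

    anchors[i] and positives[i] are assumed to be embeddings of two views of
    the same observation; every positives[j] with j != i acts as a negative.
    The temperature default is an assumption, not a value from the paper.
    """
    def normalize(v):
        # L2-normalize so the dot product below is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, ai in enumerate(a):
        # Similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair (index i) as the target.
        total += log_sum_exp - logits[i]
    return total / len(a)


if __name__ == "__main__":
    views_a = [[1.0, 0.0], [0.0, 1.0]]
    aligned = info_nce(views_a, [[1.0, 0.0], [0.0, 1.0]])
    shuffled = info_nce(views_a, [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < shuffled)
```

The loss is small when each anchor is most similar to its own positive and grows when a different sample in the batch is the closer match, which is the pressure that pulls two views of the same observation together in embedding space.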

References

Bertoin, D., Zouitine, A., Zouitine, M., and Rachelson, E. (2022). Look where you look! Saliency-guided Q-networks for generalization in visual reinforcement learning. Advances in Neural Information Processing Systems, 35:30693–30706.

Grooten, B., Tomilin, T., Vasan, G., Taylor, M. E., Mahmood, R. A., Fang, M., Pechenizkiy, M., and Mocanu, D. C. (2024). MaDi: Learning to mask distractions for generalization in visual deep reinforcement learning. In AAMAS’24: 2024 International Conference on Autonomous Agents and Multiagent Systems. IFAAMAS.

Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning, pages 1861–1870.

Hansen, N., Su, H., and Wang, X. (2021). Stabilizing deep Q-learning with ConvNets and vision transformers under data augmentation. Advances in Neural Information Processing Systems, 34:3680–3693.

Laskin, M., Lee, K., Stooke, A., Pinto, L., Abbeel, P., and Srinivas, A. (2020). Reinforcement learning with augmented data. Advances in Neural Information Processing Systems, 33:19884–19895.

Li, B., François-Lavet, V., Doan, T., and Pineau, J. (2021). Domain adversarial reinforcement learning. arXiv preprint arXiv:2102.07097.

Pinto, L., Davidson, J., Sukthankar, R., and Gupta, A. (2017). Robust adversarial reinforcement learning. In International Conference on Machine Learning, pages 2817–2826. PMLR.

Yarats, D., Kostrikov, I., and Fergus, R. (2021). Image augmentation is all you need: Regularizing deep reinforcement learning from pixels. In International Conference on Learning Representations.