mixup: Beyond Empirical Risk Minimization
Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz | ICLR 2018
In a Nutshell 🥜
This paper introduces mixup, a very simple and straightforward yet surprisingly effective data augmentation for classification.
Formally, mixup combines two samples through a weighted average, and combines their labels accordingly:

x̃ = λ·x_i + (1 − λ)·x_j,  ỹ = λ·y_i + (1 − λ)·y_j,  where λ ~ Beta(α, α), λ ∈ [0, 1].
Intuitively, this generates virtual training examples along the line segment between x_i and x_j (with labels interpolated the same way), which turns out to be an extremely effective data augmentation method.
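The interpolation above can be sketched in a few lines of NumPy (the function name and the default α = 0.2 are illustrative choices here, not from the paper's code; the paper samples λ from Beta(α, α)):

```python
import numpy as np

def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two samples and their one-hot labels with a Beta-distributed weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # lambda in [0, 1]
    x = lam * x1 + (1.0 - lam) * x2       # x-tilde: convex combination of inputs
    y = lam * y1 + (1.0 - lam) * y2       # y-tilde: same combination of labels
    return x, y
```

In practice the mixing is applied per mini-batch (e.g., blending a batch with a shuffled copy of itself), so it adds essentially no overhead to training.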
Detailed empirical studies and ablations were performed on various datasets to show that the method is effective across tasks and architectures.
Some Thoughts 💭
The concept of mixup is surprisingly simple, yet it achieves state-of-the-art results on many tasks.
Numerous papers have since extended mixup to other computer vision tasks (e.g., PointMixup).
Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.