Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros | ICCV 2017
In a Nutshell 🥜
Zhu et al. [1] tackle the task of image-to-image translation, such as turning photos into paintings or even horses into zebras. Formally, the task is to learn a mapping function G: X→Y that translates an image x from source domain X into target domain Y such that ŷ = G(x) is indistinguishable from images drawn from the distribution of Y. Typically, this task uses an adversarial loss, with an adversary that tries to distinguish ŷ from a paired image y. However, the paper specifically looks at cases where no paired examples (x, y) are available to learn from.
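For reference, the adversarial loss can be written down explicitly. The following is the standard GAN objective applied to G, where D_Y denotes the discriminator (the adversary) for domain Y. D_Y and the data distributions p_data are notation not introduced above, and the paper notes that its actual implementation swaps the log terms for a least-squares variant, so treat this as the textbook form rather than the exact training code:

```latex
\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  = \mathbb{E}_{y \sim p_{\text{data}}(y)}\left[\log D_Y(y)\right]
  + \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log\left(1 - D_Y(G(x))\right)\right]
```

G tries to minimize this quantity while D_Y tries to maximize it. In the unpaired setting, y is simply any real image from Y rather than one paired with x.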
To address the absence of paired examples, the paper introduces a cycle consistency loss. The cycle consistency loss is based on the assumption that if we translate an image x in domain X to domain Y with G: X→Y, then translate it back to domain X with F: Y→X, we should arrive back at the image x. Applying this assumption, the paper trains G and F simultaneously with an adversarial loss plus a cycle consistency loss that encourages F(G(x)) ≈ x and G(F(y)) ≈ y.
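To make the cycle consistency idea concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: images are flat lists of floats, the generators are hypothetical toy functions, and the distance is a mean absolute difference standing in for the L1 norm. The names `l1_distance`, `cycle_consistency_loss`, `F_inverse`, and `F_lossy` are illustrative assumptions.

```python
from typing import Callable, List, Sequence

Image = Sequence[float]  # assumption: a flattened image, standing in for a tensor
Generator = Callable[[Image], List[float]]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    assert len(a) == len(b), "images must have the same size"
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    G: Generator, F: Generator, xs: Sequence[Image], ys: Sequence[Image]
) -> float:
    """Average reconstruction error of both cycles: x -> G -> F and y -> F -> G."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Hypothetical toy "generators" (real ones would be neural networks).
def G(x: Image) -> List[float]:
    return [2.0 * p for p in x]  # X -> Y


def F_inverse(y: Image) -> List[float]:
    return [0.5 * p for p in y]  # Y -> X, exactly undoes G


def F_lossy(y: Image) -> List[float]:
    return [0.0 for _ in y]  # Y -> X, discards all information about y


# Unpaired samples: xs and ys are not matched to each other.
xs = [[0.25, 0.5], [1.0, 0.0]]
ys = [[0.5, 1.0], [2.0, 0.0]]

print(cycle_consistency_loss(G, F_inverse, xs, ys))  # 0.0
print(cycle_consistency_loss(G, F_lossy, xs, ys))    # 1.3125
```

The loss is zero when F undoes G on both cycles and grows when a round trip loses information. It never compares an x against a y, which is why no paired examples are needed.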
The paper then runs ablation studies and compares against baselines, reporting improved performance. It also provides qualitative examples on a variety of image-to-image translation tasks, such as photos to paintings of various styles, horse to zebra, summer scenery to winter scenery, apples to oranges, semantic segmentation to photos, and Google Maps to aerial photos.
Some Thoughts 💭
This paper contributes a cycle consistency loss for the task of image-to-image translation, inspired by the techniques of back translation and reconciliation used by human translators.
I enjoy the paper’s clarity in explaining both the intuition and implementation of the cycle consistency loss.
The examples provided in the paper and on the project website also nicely illustrate the fun and creative potential applications.
[1] Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2223-2232).