Abstract
Over the last decade, deep generative models have evolved to generate realistic and sharp images. The success of these models is often attributed to an extremely large number of trainable parameters and an abundance of training data, with limited or no understanding of the underlying data manifold. In this article, we explore the possibility of learning a deep generative model that is structured to better capture the underlying manifold's geometry, to effectively improve image generation while providing implicit controlled generation by design. Our approach structures the latent space into multiple disjoint representations capturing different attribute manifolds. The global representations are guided by a disentangling loss for effective attribute representation learning and a differential manifold divergence loss to learn an effective implicit generative model. Experimental results on a 3D shapes dataset demonstrate the model's ability to disentangle attributes without direct supervision and its controllable generative capabilities. These findings underscore the potential of structuring deep generative models to enhance image generation and attribute control without direct supervision from ground-truth attributes, signaling progress toward more sophisticated deep generative models.
Original language | English (US) |
---|---|
Article number | 1274779 |
Journal | Frontiers in Computer Science |
Volume | 6 |
State | Published - 2024 |
Keywords
- auto-encoders
- generative models
- geometry
- graph divergence
- manifolds
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Science Applications