Paper
Xu Cai, Yang Wu, Guanbin Li*, Ziliang Chen, Liang Lin, “FRAME Revisited: An Interpretation View Based on Particle Evolution”, Proc. of AAAI Conference on Artificial Intelligence (AAAI), 2019 (camera ready). Slides Code Paper
Abstract
FRAME (Filters, Random fields, And Maximum Entropy) [1] is an energy-based descriptive model that synthesizes visual realism by capturing mutual patterns from structural input signals. Maximum likelihood estimation (MLE) is applied by default, yet it commonly causes unstable training energy that wrecks the generated structures, a phenomenon that remains unexplained. In this paper, we provide a new theoretical insight for analyzing FRAME from the perspective of particle physics, ascribing this phenomenon to the KL-vanishing issue. To stabilize the energy dissipation, we propose an alternative Wasserstein distance in discrete time, based on the conclusion that the Jordan-Kinderlehrer-Otto (JKO) [2] discrete flow approximates the KL discrete flow when the time step size tends to 0. Moreover, this metric still maintains the model’s statistical consistency. Quantitative and qualitative experiments have been conducted on several widely used datasets. The empirical results demonstrate the effectiveness and superiority of our method.
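For readers unfamiliar with the JKO scheme mentioned in the abstract, the block below sketches its standard form as given in [2]. The symbols (step size h, potential \Psi, inverse temperature \beta, stationary density \rho_\infty) follow the usual presentation of that scheme and are assumptions here; they are not necessarily the notation used in the paper.

```latex
% One JKO step with step size h > 0: a proximal step on the free energy F
% in the 2-Wasserstein metric W_2.
\rho^{(k)} = \arg\min_{\rho}
  \left\{ \tfrac{1}{2}\, W_2\!\left(\rho^{(k-1)}, \rho\right)^{2} + h\, F(\rho) \right\},
\qquad
F(\rho) = \int \Psi(x)\, \rho(x)\, dx + \beta^{-1} \int \rho(x) \log \rho(x)\, dx .

% Up to an additive constant, F is a scaled KL divergence to the Gibbs density
% \rho_\infty(x) = Z^{-1} e^{-\beta \Psi(x)}:
F(\rho) = \beta^{-1}\, \mathrm{KL}\!\left(\rho \,\|\, \rho_\infty\right) - \beta^{-1} \log Z .

% As h \to 0, the interpolated iterates converge to the solution of the
% Fokker--Planck equation
\partial_t \rho = \nabla \cdot \left(\rho\, \nabla \Psi\right) + \beta^{-1} \Delta \rho .
```

Read this way, each discrete step decreases a KL-type energy while the Wasserstein term penalizes large moves away from the previous density, which is the sense in which the discrete flow dissipates energy in small steps.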
Motivations
Model Collapse Identification
The existence of model collapse is confirmed on a subset of SUN [5].
Quantitative and Qualitative Experimental Results
On large datasets CelebA[3] and LSUN-Bedroom[4]
Quantitative results are also reported for the three datasets above. The table lists the Inception scores of the compared generative models on CIFAR-10.
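As a reminder of what the metric measures, the Inception score is commonly defined as the exponential of the average KL divergence between each sample's predicted class distribution p(y|x) and the marginal p(y). The sketch below computes it from a matrix of class probabilities. The function name `inception_score` and the `eps` parameter are illustrative assumptions, the classifier that would produce the probabilities is not included, and the exact evaluation protocol used for the table is not stated on this page.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception score from class probabilities (hypothetical helper).

    probs: array of shape (num_samples, num_classes); each row is p(y|x)
           for one generated sample and is assumed to sum to 1.
    eps:   small constant to avoid log(0) (an assumption of this sketch).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all samples.
    marginal = probs.mean(axis=0, keepdims=True)
    # Per-sample KL(p(y|x) || p(y)).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    # Exponential of the mean KL divergence.
    return float(np.exp(kl.mean()))


if __name__ == "__main__":
    # Confident, evenly spread predictions over 4 classes give a score near 4.
    print(inception_score(np.eye(4)))
    # Uninformative predictions give a score near 1.
    print(inception_score(np.full((8, 4), 0.25)))
```

Higher scores indicate predictions that are individually confident yet diverse across samples; the score is bounded above by the number of classes.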
References
[1] Zhu, S. C.; Wu, Y. N.; and Mumford, D. 1997. Minimax entropy principle and its application to texture modeling. Neural Computation 9(8):1627–1660.
[2] Jordan, R.; Kinderlehrer, D.; and Otto, F. 1998. The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis 29(1):1–17.
[3] Liu, Z.; Luo, P.; Wang, X.; and Tang, X. 2015. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, 3730–3738.
[4] Yu, F.; Zhang, Y.; Song, S.; Seff, A.; and Xiao, J. 2015. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
[5] Xiao, J.; Hays, J.; Ehinger, K. A.; Oliva, A.; and Torralba, A. 2010. SUN database: Large-scale scene recognition from abbey to zoo. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, 3485–3492. IEEE.