Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift
- Ruijia Xu, Ziliang Chen (co-first author), Wangmeng Zuo, Junjie Yan and Liang Lin*. Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. [PDF] [Digit Five] [Code]
Unsupervised domain adaptation (UDA) conventionally assumes that labeled source samples are drawn from a single underlying source distribution. In practical scenarios, however, labeled data are typically collected from diverse sources. The multiple sources differ not only from the target but also from each other, so the domain adapters should not be modeled in the same way. Moreover, the sources may not completely share their categories, which further poses a category shift challenge for multi-source (unsupervised) domain adaptation (MDA). In this paper, we propose a deep cocktail network (DCTN) to battle the domain and category shifts among multiple sources. Motivated by the distribution weighted combining rule, the target distribution can be represented as a weighted combination of the source distributions, and training MDA via DCTN is then performed as two alternating steps: i) It deploys multi-way adversarial learning to minimize the discrepancy between the target and each of the multiple source domains, which also yields source-specific perplexity scores denoting the possibilities that a target sample belongs to the different source domains. ii) The multi-source category classifiers are integrated with the perplexity scores to classify target samples, and the pseudo-labeled target samples together with source samples are utilized to update the multi-source category classifiers and the feature extractor. We evaluate DCTN on three domain adaptation benchmarks, which clearly demonstrate the superiority of our framework.
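The perplexity-weighted combination in step ii) can be sketched as follows. This is a simplified illustration, assuming all sources share one label space (the paper also handles category shift), and the softmax normalization of adversary scores is an illustrative choice rather than the paper's exact weighting rule:

```python
import numpy as np

def perplexity_weights(adversary_scores):
    """Turn per-source adversary outputs into normalized weights.

    adversary_scores[j] is a scalar indicating how closely a target
    feature resembles source j (higher = more similar).  Softmax
    normalization here is a simplification of the paper's perplexity
    scores.
    """
    e = np.exp(adversary_scores - np.max(adversary_scores))
    return e / e.sum()

def combined_target_prediction(source_class_probs, adversary_scores):
    """Weighted combination of the per-source classifiers' outputs.

    source_class_probs: (num_sources, num_classes) softmax outputs.
    Returns the fused class-probability vector for one target sample.
    """
    w = perplexity_weights(np.asarray(adversary_scores, dtype=float))
    return w @ np.asarray(source_class_probs, dtype=float)

# Example: two sources, three shared classes.
probs = np.array([[0.7, 0.2, 0.1],   # source j's classifier
                  [0.1, 0.1, 0.8]])  # source k's classifier
scores = [2.0, 0.5]                  # target resembles source j more
fused = combined_target_prediction(probs, scores)
print(fused.argmax())  # -> 0, the class favored by the higher-weighted source
```

Because the weights sum to one, the fused output remains a valid class-probability vector over the shared label space.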
An overview of the proposed Deep Cocktail Network (DCTN). Our framework receives multi-source instances with annotated ground truth and adapts to classify the target samples. Consider sources j and k for simplicity. i) The feature extractor maps the target, source j and source k into a common feature space. ii) The category classifier receives the target feature and produces the j-th and k-th classifications based upon the categories in sources j and k respectively. iii) The domain discriminator receives features from source j, source k and the target, then provides the k-th adversary between the target and source k, as well as the j-th adversary between the target and source j. The j-th and k-th adversaries provide source j and k perplexity scores to weight the j-th and k-th classifications correspondingly. iv) The target classification operator integrates all weighted classification results and then predicts the target class across category shifts.
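After the integration operator fuses the weighted classifications, the alternating training in step ii) selects confidently classified target samples as pseudo-labels. A minimal sketch of that selection, where the confidence threshold is an illustrative hyperparameter rather than a value from the paper:

```python
import numpy as np

def select_pseudo_labels(fused_probs, threshold=0.8):
    """Pick confidently classified target samples for the update step.

    fused_probs: (num_targets, num_classes) fused predictions from the
    target classification operator.  Returns the indices of samples whose
    top-class confidence meets the threshold, and their pseudo labels.
    """
    confidence = fused_probs.max(axis=1)
    keep = np.where(confidence >= threshold)[0]
    return keep, fused_probs[keep].argmax(axis=1)

# Example: three target samples, three classes.
batch = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],   # too uncertain, discarded
                  [0.10, 0.10, 0.80]])
idx, labels = select_pseudo_labels(batch)
print(idx.tolist(), labels.tolist())  # -> [0, 2] [0, 2]
```

The pseudo-labeled targets are then mixed with the labeled source samples to update the feature extractor and the multi-source category classifiers, as described in the abstract.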
Evaluation under the vanilla setting
Evaluation under the category shift setting
Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation with multiple sources. In Advances in Neural Information Processing Systems, pages 1041–1048, 2009.
M. Long, Y. Cao, J. Wang, and M. Jordan. Learning transferable features with deep adaptation networks. In International Conference on Machine Learning, pages 97–105, 2015.
P. P. Busto and J. Gall. Open set domain adaptation. In The IEEE International Conference on Computer Vision (ICCV), volume 1, 2017.
Y. Ganin and V. Lempitsky. Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning, pages 1180–1189, 2015.
J. Xie, W. Hu, S.-C. Zhu, and Y. N. Wu. Learning sparse frame models for natural image patterns. International Journal of Computer Vision, 114(2-3):91–112, 2015.