Discriminative And-Or Graph Learning
- Liang Lin, Xiaolong Wang, Wei Yang, and Jian-Huang Lai, Discriminatively Trained And-Or Graph Models for Object Shape Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), DOI: 10.1109/TPAMI.2014.2359888, 2014. [PDF]
- Liang Lin, Xiaolong Wang, Wei Yang, and Jian-Huang Lai, Learning Contour-Fragment-based Shape Model with And-Or Tree Representation, Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [Paper] [Dataset] [Code]
This paper investigates a novel reconfigurable part-based model, the And-Or graph model, for recognizing object shapes in images, as Fig. 1 shows.
The proposed model consists of four layers:
- Leaf-nodes: local classifiers for detecting contour fragments;
- Or-nodes: switches, each of which activates one of its child leaf-nodes, making the model reconfigurable during inference;
- And-nodes: nodes that capture holistic shape deformations;
- Root-node: also an or-node, which activates one of its child and-nodes to deal with large global variations (e.g. different poses and views).
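To make the four layers concrete, the following is a minimal Python sketch of bottom-up inference over such a structure, assuming the leaf responses have already been computed. The names `OrNode`, `AndNode`, `root_inference`, and the scalar `deformation_penalty` are hypothetical and chosen for illustration only; the actual model scores deformations with learned parameters that are not described on this page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OrNode:
    # Responses of the child leaf-nodes (local contour-fragment classifiers).
    leaf_scores: List[float]

    def activate(self) -> Tuple[int, float]:
        """Switch on the single best-scoring child leaf."""
        best = max(range(len(self.leaf_scores)), key=lambda i: self.leaf_scores[i])
        return best, self.leaf_scores[best]


@dataclass
class AndNode:
    or_nodes: List[OrNode]
    # Hypothetical stand-in for the holistic deformation cost of this view.
    deformation_penalty: float = 0.0

    def score(self) -> Tuple[List[int], float]:
        """Sum the activated leaf scores of all or-nodes, minus the penalty."""
        choices: List[int] = []
        total = -self.deformation_penalty
        for node in self.or_nodes:
            index, value = node.activate()
            choices.append(index)
            total += value
        return choices, total


def root_inference(and_nodes: List[AndNode]) -> Tuple[int, List[int], float]:
    """The root is itself an or-node: it selects the best-scoring and-node."""
    results = [node.score() for node in and_nodes]
    best = max(range(len(results)), key=lambda i: results[i][1])
    return best, results[best][0], results[best][1]


# Two toy and-nodes (e.g. two views), each with its own or-nodes and leaves.
view_a = AndNode([OrNode([0.2, 0.9]), OrNode([0.5, 0.1, 0.3])], deformation_penalty=0.1)
view_b = AndNode([OrNode([0.4, 0.3]), OrNode([0.6])])
selected_view, selected_leaves, total_score = root_inference([view_a, view_b])
```

In this toy setup the first view is selected, with the second leaf of its first or-node and the first leaf of its second or-node switched on. Taking a maximum at or-nodes and a sum at and-nodes is the design choice that lets one model cover many configurations without enumerating them.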
We train the And-Or model discriminatively from weakly annotated data using a non-convex optimization algorithm. The algorithm iteratively determines the latent model structure (e.g. the nodes and their layouts) jointly with parameter learning.
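The general idea of alternating between latent choices and parameter updates can be illustrated with a generic latent-variable training loop. This is a simplified sketch in the style of a latent SVM, not the paper's algorithm: it keeps the set of candidate configurations fixed, whereas the method described above also changes the model structure during training. The function names, the hinge loss, and all hyperparameters are assumptions made for the example.

```python
from typing import List, Sequence

Vector = List[float]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(w, x))


def best_config(w: Vector, candidates: List[Vector]) -> int:
    """Index of the highest-scoring latent configuration under w."""
    return max(range(len(candidates)), key=lambda i: dot(w, candidates[i]))


def train(positives: List[List[Vector]], negatives: List[List[Vector]],
          dim: int, rounds: int = 10, epochs: int = 50,
          lr: float = 0.1, reg: float = 0.01) -> Vector:
    """Each example is a list of candidate feature vectors, one per latent configuration."""
    w = [0.0] * dim
    for _ in range(rounds):
        # Step 1: with w fixed, impute the latent configuration of each positive.
        pos_feats = [cands[best_config(w, cands)] for cands in positives]
        # Step 2: with those choices fixed, take subgradient steps on a
        # regularized hinge loss; negatives are scored by their best configuration.
        for _ in range(epochs):
            grad = [reg * wi for wi in w]
            for x in pos_feats:
                if dot(w, x) < 1.0:
                    grad = [g - xi for g, xi in zip(grad, x)]
            for cands in negatives:
                x = cands[best_config(w, cands)]
                if dot(w, x) > -1.0:
                    grad = [g + xi for g, xi in zip(grad, x)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Toy data: two positives and two negatives with 2-D features.
toy_positives = [[[1.0, 0.0], [0.0, 0.2]], [[0.9, 0.1], [0.1, 0.0]]]
toy_negatives = [[[0.0, 1.0], [0.1, 0.8]], [[0.0, 0.9]]]
weights = train(toy_positives, toy_negatives, dim=2)
```

Step 1 makes the objective convex in `w` for the positives, which is why this kind of alternation is a common way to handle a non-convex latent objective.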
We validate our model on a new shape database, SYSU-Shapes, as well as two other public databases, UIUC-People and INRIA-Horse, and show superior performance over state-of-the-art methods.
Experiment I: SYSU-Shapes database
Detection accuracy on the SYSU-Shapes database.
Experiment II: UIUC-People dataset
|Method|Detection accuracy|
|---|---|
|Wang et al.|0.668|
|Andriluka et al.|0.506|
|Felzenszwalb et al.|0.486|
|Bourdev et al.|0.458|
Comparison of detection accuracies on the UIUC-People dataset.
Experiment III: INRIA-Horse dataset
Experiment IV (Download PPT)
It would help to generate visualizations that show what the various leaf-nodes learn for each part. One simple way to do this is to visualize image patches across the training data for any given (Or-node, Leaf-node) combination.
For example, suppose you are training a horse detector. Let's say you have an Or-node associated with the head of the horse. The “head” node has several leaves to account for changes in the appearance of the head. For each leaf, keep track of the training images on which that leaf fires.
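The bookkeeping described above can be sketched as a simple grouping step. The record layout (image id, or-node name, leaf index, bounding box), the function name `group_patches_by_leaf`, and the file names are hypothetical; the released code may store activations differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Box = Tuple[int, int, int, int]          # (x_min, y_min, x_max, y_max)
Activation = Tuple[str, str, int, Box]   # (image_id, or_node, leaf_index, box)


def group_patches_by_leaf(
    activations: Iterable[Activation],
) -> Dict[Tuple[str, int], List[Tuple[str, Box]]]:
    """Collect, for every (or-node, leaf) pair, the patches where that leaf fired."""
    groups: Dict[Tuple[str, int], List[Tuple[str, Box]]] = defaultdict(list)
    for image_id, or_node, leaf_index, box in activations:
        groups[(or_node, leaf_index)].append((image_id, box))
    return dict(groups)


# Toy activations logged while running the detector on training images.
logged = [
    ("horse_001.jpg", "head", 0, (10, 20, 50, 60)),
    ("horse_002.jpg", "head", 1, (12, 18, 48, 55)),
    ("horse_003.jpg", "head", 0, (30, 25, 70, 66)),
]
patches = group_patches_by_leaf(logged)
```

Each list in `patches` can then be cropped from its source images and tiled into a montage, giving one montage per leaf of the "head" node.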
Download the code and database
We built a new shape database called SYSU-Shapes, which includes carefully annotated shape contours. There are 5 categories (airplanes, boats, cars, motorbikes, and bicycles), and each category contains 200 to 500 images. The shape contours are labeled using the LabelMe toolkit. Note that each image contains at least one object of the given category and may contain more.
- D. Tran and D. Forsyth, Improved human parsing with a full relational model, In Proc. of European Conference on Computer Vision (ECCV), 2010.
- F. Jurie and C. Schmid, Scale-invariant Shape Features for Recognition of Object Categories, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
- Y. Wang, D. Tran, and Z. Liao, Learning Hierarchical Poselets for Human Parsing, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
- M. Andriluka, S. Roth, and B. Schiele, Pictorial structures revisited: People detection and articulated pose estimation, In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
- P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9): 1627-1645, 2010.
- L. Bourdev, S. Maji, T. Brox, and J. Malik, Detecting people using mutually consistent poselet activations, In Proc. of European Conference on Computer Vision (ECCV), 2010.