Object Detection with
Discriminative And-Or Graphs
We propose a reconfigurable model to recognize and detect multiclass (or multiview) objects with large variation in appearance. Compared with well-established hierarchical models, we study two advanced capabilities in the hierarchy for object modeling: (i) “switch” variables (i.e., or-nodes) for specifying alternative compositions, and (ii) local classifiers (i.e., leaf-nodes) shared among different classes. These capabilities enable us to account for structural variability while keeping the model compact.
Our model, in the form of an And-Or Graph, comprises four layers: a batch of leaf-nodes with collaborative edges at the bottom for localizing object parts; the or-nodes above them, which activate their child leaf-nodes; the and-nodes, which classify objects as a whole; and one root-node at the top, itself an or-node, for switching among the classes in multiclass classification. For model training, we present an EM-type algorithm, namely dynamical structural optimization (DSO), that iteratively determines the structural configuration (e.g., generating leaf-nodes, associating them with their parent or-nodes, and sharing them across classes) while optimizing the multi-layer parameters. We validate the proposed method on challenging databases, e.g., PASCAL VOC 2007 and UIUC-People, where it achieves state-of-the-art performance.
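The four-layer hierarchy can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the rule that an or-node takes the maximum over its child leaf scores, the rule that an and-node sums its or-node scores plus a bias, and the toy scores themselves. The collaborative edges between leaf-nodes are omitted, so this is not the scoring function of the actual model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class OrNode:
    """Switch variable: selects one of its child leaf-nodes."""
    leaf_ids: List[int]

    def score(self, leaf_scores: Dict[int, float]) -> Tuple[float, int]:
        # Assumed rule: activate the child leaf with the highest response.
        best = max(self.leaf_ids, key=lambda i: leaf_scores[i])
        return leaf_scores[best], best


@dataclass
class AndNode:
    """Composes several or-nodes into one whole-object hypothesis."""
    label: str
    or_nodes: List[OrNode]
    bias: float = 0.0

    def score(self, leaf_scores: Dict[int, float]) -> Tuple[float, List[int]]:
        # Assumed rule: sum the scores of the leaves chosen by each or-node.
        picks = [o.score(leaf_scores) for o in self.or_nodes]
        total = self.bias + sum(s for s, _ in picks)
        return total, [i for _, i in picks]


@dataclass
class RootNode:
    """Top-level or-node: switches between class-specific and-nodes."""
    and_nodes: List[AndNode]

    def classify(self, leaf_scores: Dict[int, float]) -> Tuple[str, float, List[int]]:
        scored = [(a.label, *a.score(leaf_scores)) for a in self.and_nodes]
        return max(scored, key=lambda t: t[1])


# Hypothetical leaf responses at one image location (leaf id -> score).
leaf_scores = {0: 0.25, 1: 1.0, 2: 0.5, 3: 0.125}

horse = AndNode("horse", [OrNode([0, 1]), OrNode([2])])
sheep = AndNode("sheep", [OrNode([0, 3]), OrNode([2])])
root = RootNode([horse, sheep])

label, total, active_leaves = root.classify(leaf_scores)
print(label, total, active_leaves)
```

Under these assumptions, inference is a bottom-up pass: each or-node reports which leaf it activated, so the returned list doubles as the chosen structural configuration.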
The main properties of our approach can be highlighted as:
- Model reconfigurability: Inspired by the And-Or graph models in [2,3], we develop “switch variables”, namely or-nodes, to specify alternative compositions in the hierarchy. It is worth mentioning that the association of or-nodes with their child leaf-nodes is determined automatically during model training. In Fig. 1, the sheep head is localized by the leaf-node that is activated by its parent or-node.
- Model sharing: The leaf-nodes can be shared among different classes, which keeps the model compact while representing multiple object categories. For example, in Fig. 1, the feet in the horse and sheep categories have similar appearance, and thus both can be detected by a leaf-node shared across the two classes.
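To make the compactness argument concrete, here is a small bookkeeping sketch. The leaf ids and the class-to-leaf assignment are hypothetical. Only the shared "feet" leaf follows the horse/sheep example above; the other parts are made up for illustration.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical leaf ids used by each class; leaf 3 plays the role of "feet".
class_leaves: Dict[str, List[int]] = {
    "horse": [0, 2, 3],  # head, torso, feet
    "sheep": [1, 4, 3],  # head, torso, feet (same leaf id 3 as horse)
}

# Count how many classes reference each leaf.
usage = Counter(leaf for leaves in class_leaves.values() for leaf in set(leaves))

shared = sorted(leaf for leaf, n in usage.items() if n > 1)
leaves_with_sharing = len(usage)
leaves_without_sharing = sum(len(set(leaves)) for leaves in class_leaves.values())

print("shared leaf ids:", shared)
print("classifiers to store:", leaves_with_sharing, "instead of", leaves_without_sharing)
```

Each shared leaf is stored and evaluated once but referenced by several classes, so the saving grows with the number of classes that reuse it.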
Dynamical Structural Optimization (DSO)
This learning algorithm is an EM-type procedure that incorporates structure reconfiguration and parameter estimation. It extends the CCCP procedure. During each iteration, the algorithm dynamically creates and removes leaf-nodes associated with their parent or-nodes, and shares leaf-nodes among classes.
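For readers unfamiliar with CCCP, the following toy sketch shows only the basic concave-convex iteration that DSO builds on, not DSO itself. It minimizes a scalar objective written as a difference of two convex functions, f(x) - g(x), by repeatedly replacing g with its tangent at the current point and minimizing the resulting convex surrogate. The functions f(x) = x^4 and g(x) = 2x^2 are hypothetical choices made so that each step has a closed form. The real training objective, with latent structure and multi-layer parameters, is far richer.

```python
import math


def objective(x: float) -> float:
    # Toy objective f(x) - g(x) with convex f(x) = x**4 and g(x) = 2 * x**2.
    return x**4 - 2 * x**2


def cccp_step(x_t: float) -> float:
    # Linearize g at x_t: g'(x_t) = 4 * x_t.
    # Minimize the convex surrogate x**4 - 4 * x_t * x.
    # Stationarity gives 4 * x**3 = 4 * x_t, i.e. x = cbrt(x_t).
    return math.copysign(abs(x_t) ** (1.0 / 3.0), x_t)


def cccp(x0: float, max_iters: int = 50, tol: float = 1e-12):
    x = x0
    history = [objective(x)]
    for _ in range(max_iters):
        x_next = cccp_step(x)
        history.append(objective(x_next))
        converged = abs(x_next - x) < tol
        x = x_next
        if converged:
            break
    return x, history


x_star, history = cccp(0.5)
print(x_star, history[0], history[-1])
```

Starting from 0.5, the iterates approach the local minimum at x = 1 and the objective never increases between steps, which is the monotonicity property that makes CCCP a natural base for an EM-type learning loop.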
We evaluate our method on two challenging datasets: UIUC-People and PASCAL VOC 2007.
We show the results of detection as follows.
We also evaluate the benefits of sharing leaf-nodes.
[1] Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection. X. Wang, L. Lin, L. Huang, and S. Yan, in CVPR 2013
[2] Latent Hierarchical Structural Learning for Object Detection. L. Zhu, Y. Chen, Y. Lu, C. Lin, and A. Yuille, in CVPR 2010
[3] Learning Hierarchical Poselets for Human Parsing. Y. Wang, D. Tran, and Z. Liao, in CVPR 2011
[4] The Concave-Convex Procedure (CCCP). A. Yuille and A. Rangarajan, in NIPS 2001