TIP 2015
PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence.
Keze Wang, Liang Lin, Jiangbo Lu, Chenglong Li, Keyang Shi


Driven by recent vision and graphics applications such as image segmentation and object recognition, computing pixel-accurate saliency values that uniformly highlight foreground objects has become increasingly important. In this paper, we propose a unified framework, Pixelwise Image Saliency Aggregation (PISA), which combines various bottom-up cues and priors to generate spatially coherent yet detail-preserving, pixel-accurate, and fine-grained saliency maps. PISA overcomes the limitations of previous methods that rely on homogeneous superpixel segmentation and color-only treatment. It aggregates multiple saliency cues in a global context, namely complementary color-based and structure-based contrast measures, together with their spatial priors in the image domain. The saliency confidences are further jointly modeled with a neighborhood consistency constraint in an energy minimization formulation, in which each pixel is assigned one of multiple saliency levels. Instead of using discrete labeling techniques, we employ the cost-volume filtering technique to solve our formulation, which fuses the saliency values within an adaptive local observation window while preserving edge-aware details. In addition, a faster version of PISA is developed using gradient-driven image subsampling, which greatly improves the running efficiency while keeping comparable accuracy. Extensive experiments on six public datasets demonstrate the superior performance of PISA over state-of-the-art approaches.
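The core aggregation idea can be illustrated with a minimal numpy sketch: two complementary pixelwise contrast maps are weighted by a shared spatial prior and fused into a single normalized saliency map. This is only a rough sketch of the aggregation step, not the paper's formulation; the function names and the Gaussian center-bias prior are our own assumptions, and PISA's actual pipeline additionally performs the edge-preserving cost-volume filtering described above.

```python
import numpy as np

def spatial_prior(shape, sigma=0.33):
    # Illustrative center-bias prior: pixels near the image center
    # receive higher weight (a common assumption, not PISA's exact prior).
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = ((ys - h / 2) / h) ** 2 + ((xs - w / 2) / w) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def aggregate_saliency(color_contrast, structure_contrast, prior):
    # Fuse the two complementary contrast cues under the spatial prior,
    # then rescale the result to [0, 1].
    fused = (color_contrast + structure_contrast) * prior
    fused = fused - fused.min()
    return fused / (fused.max() + 1e-12)
```

In the full method the fused values are not taken directly as output; they enter an energy minimization solved by cost-volume filtering, which enforces neighborhood consistency while preserving edges.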


Figure. (a) Original image. (b) Color-based contrast measure, which better detects homogeneous salient regions. (c) Structure-based contrast measure, which better detects structured salient regions. (d) PISA result. By aggregating the detections generated in (b) and (c), PISA achieves a more complete detection.



PISA Workflow




F-PISA: a fast implementation at 44 milliseconds per image


Instead of processing the full image grid, F-PISA performs a gradient-driven subsampling of the input image I and applies the saliency computation only to the selected set of pixels. This significantly improves efficiency (14 times faster than PISA, at 44 ms per image on an Intel i7-2600 3.4 GHz CPU) while retaining good detection accuracy.



Visual Comparisons with State-of-the-Art Methods


Figure. Visual comparison between existing methods and our PISA and F-PISA methods on all six datasets: ASD [1] (top two rows), SOD [7] (next two rows), SED1 [8] (next two rows), ECSSD [9] (next two rows), PASCAL-1500 [10] (next two rows), and TCD [11] (bottom two rows). Here SRC denotes the original image and GT the ground truth.



PISA/F-PISA results on six datasets

ASD [1]: Local Drive | Cloud Drive
SOD [7]: Local Drive | Cloud Drive
SED1 [8]: Local Drive | Cloud Drive
ECSSD [9]: Local Drive | Cloud Drive
PASCAL-1500 [10]: Local Drive | Cloud Drive
TCD [11]: Local Drive | Cloud Drive







  1. FT, ASD — R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, "Frequency-tuned salient region detection," in IEEE CVPR, 2009, pp. 1597–1604.
  2. HC, RC — M. Cheng, G. Zhang, N. Mitra, X. Huang, and S. Hu, "Global contrast based salient region detection," in IEEE CVPR, 2011.
  3. SF — F. Perazzi, P. Krahenbuhl, Y. Pritch, and A. Hornung, "Saliency filters: Contrast based filtering for salient region detection," in IEEE CVPR, 2012.
  4. SR — X. Hou and L. Zhang, "Saliency detection: A spectral residual approach," in IEEE CVPR, 2007, pp. 1–8.
  5. CA — S. Goferman, L. Zelnik-Manor, and A. Tal, "Context-aware saliency detection," in IEEE CVPR, 2010, pp. 2376–2383.
  6. LC — Y. Zhai and M. Shah, "Visual attention detection in video sequences using spatiotemporal cues," in ACM Multimedia, 2006, pp. 815–824.
  7. SOD — V. Movahedi and J. Elder, "Design and perceptual validation of performance measures for salient object segmentation," in POCV, 2010.
  8. SED1 — S. Alpert, M. Galun, R. Basri, and A. Brandt, "Image segmentation by probabilistic bottom-up aggregation and cue integration," in IEEE CVPR, 2007.
  9. ECSSD — Extended Complex Scene Saliency Dataset (ECSSD).
  10. PASCAL-1500 — W. Zou, K. Kpalma, Z. Liu, and J. Ronsin, "Segmentation driven low-rank matrix recovery for saliency detection," in BMVC, 2013, pp. 1–13.
  11. TCD — Taobao Commodity Dataset.
  12. J. Kopf, M. Cohen, D. Lischinski, and M. Uyttendaele, "Joint bilateral upsampling," ACM TOG, 2007.
  13. HS — Q. Yan, L. Xu, J. Shi, and J. Jia, "Hierarchical saliency detection," in IEEE CVPR, 2013, pp. 1155–1162.
  14. MC — B. Jiang, L. Zhang, H. Lu, and C. Yang, "Saliency detection via absorbing Markov chain," in IEEE ICCV, 2013, pp. 1665–1672.
  15. DSR — X. Li, H. Lu, L. Zhang, X. Ruan, and M. Yang, "Saliency detection via dense and sparse reconstruction," in IEEE ICCV, 2013, pp. 2976–2983.
  16. GM — C. Yang, L. Zhang, H. Lu, and X. Ruan, "Saliency detection via graph-based manifold ranking," in IEEE CVPR, 2013, pp. 3166–3173.
  17. GC — M. Cheng, J. Warrell, W. Lin, S. Zheng, V. Vineet, and N. Crook, "Efficient salient region detection with soft image abstraction," in IEEE ICCV, 2013, pp. 1529–1536.
  18. CB — H. Jiang, J. Wang, Z. Yuan, T. Liu, and N. Zheng, "Automatic salient object segmentation based on context and shape prior," in BMVC, 2011, pp. 1101–1112.