Human Cyber Physical Intelligence Integration Lab

Computer Vision
Multimodal
Robotics
  • VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
    CVPR 2025
  • DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh
    CVPR 2025
  • HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
    CVPR 2025
  • PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention
    CVPR 2025
  • Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation
    CVPR 2025
  • No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition
    CVPR 2025
  • Rethinking Query-based Transformer for Continual Image Segmentation
    CVPR 2025
  • TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts
    SIGGRAPH 2024 & ACM Transactions on Graphics (TOG)
  • NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
    CVPR 2024 Oral paper (Best Paper Candidate)
  • Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
    CVPR 2024
  • Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
    CVPR 2024
  • 3D Visibility-aware Generalizable Neural Radiance Fields for Interacting Hands
    AAAI 2024
  • AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis
    CVPR 2024
  • DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
    T-PAMI 2024
  • Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach
    AAAI 2024
  • VidMaestro: Towards Photo-realistic and High-dynamic Video Generations
    arXiv 2024
  • MLP Can Be A Good Transformer Learner
    CVPR 2024 Oral paper (Best Paper Candidate)
  • Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
    T-PAMI 2023
  • DreamEditor: Text-Driven 3D Scene Editing with Neural Fields
    SIGGRAPH Asia 2023
  • SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
    ICCV 2023
  • Visual Causal Scene Refinement for Video Question Answering
    ACM MM 2023
  • Parametric Linear Blend Skinning Model for Multiple-Shape 3D Garments
    arXiv 2024
  • ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection
    NeurIPS 2023
  • Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction
    T-NNLS 2022
  • TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
    T-IP 2022
  • Towards Controllable One-Shot Text-to-image Generation via Contrastive Prompt-Tuning
    arXiv 2022
  • Graph-Convolved Factorization Machines for Personalized Recommendation
    T-KDE 2021
  • Linguistically Routing Capsule Network for Out-of-distribution Visual Question Answering
    ICCV 2021
  • Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning
    T-PAMI 2021
  • Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
    T-NNLS 2021
  • Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
    CVPR 2021
  • Semantics-Aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition
    T-IP 2021
  • A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning
    T-PAMI 2021
  • Deep CockTail Networks: A Universal Framework for Visual Multi-source Domain Adaptation
    IJCV 2021
  • Solving Inefficiency of Self-supervised Representation Learning
    ICCV 2021
  • Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift
    ICCV 2021
  • Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp
    ICRA 2021
  • Structured Attention Network for Referring Image Segmentation
    T-MM 2021
  • Relationship-Embedded Representation Learning for Grounding Referring Expressions
    T-PAMI 2021
  • SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving
    NeurIPS 2021 Datasets and Benchmarks Track
  • Bidirectional Graph Reasoning Network for Panoptic Segmentation
    CVPR 2020
  • EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning
    ECCV 2020
  • Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective
    T-PAMI 2020
  • An Adversarial Perturbation Oriented Domain Adaptation Approach for Semantic Segmentation
    AAAI 2020
  • Physical-Virtual Collaboration Modeling for Intra- and Inter-Station Metro Ridership Prediction
    T-ITS 2020
  • Dynamic Spatial-Temporal Representation Learning for Traffic Flow Prediction
    T-ITS 2020
  • Grammatically Recognizing Images with Tree Convolution
    KDD 2020
  • Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer
    T-PAMI 2020
  • Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition
    T-PAMI 2020
  • 3D Human Pose Machines with Self-supervised Learning
    T-PAMI 2019
  • SNAS: Stochastic Neural Architecture Search
    ICLR 2019
  • FRAME Revisited: An Interpretation View Based on Particle Evolution
    AAAI 2019
  • Taxi Origin-Destination Demand Prediction with Contextualized Spatial-Temporal Network
    T-ITS 2019
  • Crowd Counting with Deep Structured Scale Integration Network
    ICCV 2019
  • Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation
    ICCV 2019
  • Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching
    ICML 2019
  • Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining
    ACM MM 2018
  • Flow Guided Recurrent Neural Encoder for Video Salient Object Detection
    CVPR 2018
  • Fine-Grained Representation Learning and Recognition by Exploiting Hierarchical Semantic Embedding
    ACM MM 2018
  • Learning to Segment Object Proposals via Recursive Neural Networks
    T-IP 2018
  • Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions
    T-PAMI 2018
  • Towards Human-Machine Cooperation: Self-supervised Sample Mining for Object Detection
    CVPR 2018
  • Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift
    CVPR 2018
  • Learning a Wavelet-like Auto-Encoder to Accelerate Deep Neural Networks
    AAAI 2018 Oral
  • Cost-Effective Object Detection: Active Sample Mining with Switchable Selection Criteria
    T-NNLS 2018
  • Convolutional Memory Blocks for Depth Data Representation Learning
    IJCAI 2018
  • Deep Co-Space: Sample Mining Across Feature Transformation for Semi-Supervised Learning
    T-CSVT 2017
  • Interpretable Structure-Evolving LSTM
    CVPR 2017
  • Active Self-Paced Learning for Cost-Effective and Progressive Face Identification
    T-PAMI 2017
  • Cross-Domain Visual Matching via Generalized Similarity Measure and Feature Learning
    T-PAMI 2016
  • SOLD: Sub-Optimal Low-Rank Decomposition for Efficient Video Segmentation
    CVPR 2015
  • Discriminatively Trained And-Or Graph Models for Object Shape Detection
    T-PAMI 2014
Human Cyber Physical Intelligence Integration Lab, Sun Yat-sen University
  • hcp@sysu.edu.cn
  • No. 132 Waihuan East Road, Guangzhou Higher Education Mega Center, Guangzhou
©2025 HCP in SYSU  粤ICP备2021037607号