RTFormer: Efficient design for real-time semantic segmentation with transformer. J Wang*, C Gou*, Q Wu, H Feng, J Han, E Ding, J Wang. NeurIPS 2022 Spotlight 35, 7423-7436, 2022 | 74 | 2022 |
DrVideo: Document Retrieval Based Long Video Understanding. Z Ma*, C Gou*, H Shi, B Sun, S Li, H Rezatofighi, J Cai. arXiv preprint arXiv:2406.12846, 2024 | 2 | 2024 |
Strong and Controllable Blind Image Decomposition. Z Zhang*, J Han*, C Gou*, H Li, L Zheng. arXiv preprint arXiv:2403.10520, 2024 | 1 | 2024 |
EZIGen: Enhancing zero-shot subject-driven image generation with precise subject encoding and decoupled guidance. Z Duan, Y Ding, C Gou, Z Zhou, E Smith, L Liu. arXiv preprint arXiv:2409.08091, 2024 | | 2024 |
How Well Can Vision Language Models See Image Details? C Gou, A Felemban, FF Khan, D Zhu, J Cai, H Rezatofighi, M Elhoseiny. arXiv preprint arXiv:2408.03940, 2024 | | 2024 |
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding. K Ataallah, C Gou, E Abdelrahman, K Pahwa, J Ding, M Elhoseiny. arXiv preprint arXiv:2406.19875, 2024 | | 2024 |
JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments. DT Le*, C Gou*, S Datta, H Shi, I Reid, J Cai, H Rezatofighi. IEEE Conference on Computer Vision and Pattern Recognition (CVPR24), 2024 | | 2024 |