Yiwei Ma 马祎炜

Cited by

	All	Since 2019
Citations	380	380
h-index	7	7
i10-index	6	6

Citations per year: 12 (2022), 128 (2023), 240 (2024)

Co-authors

Rongrong Ji 纪荣嵘, Professor, Xiamen University, Verified email at xmu.edu.cn
Xiaoshuai Sun 孙晓帅, Professor, Xiamen University, Verified email at xmu.edu.cn
Jiayi Ji, Xiamen University (XMU) & National University of Singapore (NUS), Verified email at xmu.edu.cn

PhD Student, Xiamen University

Verified email at stu.xmu.edu.cn

Multimedia, Image Captioning, Video-Text Retrieval, (2/3D) Vision and Language


Title. Authors. Venue	Cited by	Year
X-CLIP: End-to-end multi-grained contrastive learning for video-text retrieval. Y Ma, G Xu, X Sun, M Yan, J Zhang, R Ji. Proceedings of the 30th ACM International Conference on Multimedia (ACM MM …, 2022	193	2022
Towards local visual modeling for image captioning. Y Ma, J Ji, X Sun, Y Zhou, R Ji. Pattern Recognition (PR) 138, 109420, 2023	45	2023
Knowing what to learn: a metric-oriented focal mechanism for image captioning. J Ji, Y Ma, X Sun, Y Zhou, Y Wu, R Ji. IEEE Transactions on Image Processing 31, 4321-4335, 2022	32	2022
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance. Y Ma, X Zhang, X Sun, J Ji, H Wang, G Jiang, W Zhuang, R Ji. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	31	2023
Knowing what it is: semantic-enhanced dual attention transformer. Y Ma, J Ji, X Sun, Y Zhou, Y Wu, F Huang, R Ji. IEEE Transactions on Multimedia (IEEE TMM), 2022	19	2022
Beyond first impressions: Integrating joint multi-modal cues for comprehensive 3D representation. H Wang, J Tang, J Ji, X Sun, R Zhang, Y Ma, M Zhao, L Li, Z Zhao, T Lv, ... Proceedings of the 31st ACM International Conference on Multimedia, 3403-3414, 2023	10	2023
X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks. Z Qian, Y Ma, J Ji, X Sun. Proceedings of the AAAI Conference on Artificial Intelligence 38 (5), 4551-4559, 2024	7	2024
Rotated multi-scale interaction network for referring remote sensing image segmentation. S Liu, Y Ma, X Zhang, H Wang, J Ji, X Sun, R Ji. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	7	2024
X-Dreamer: Creating high-quality 3D content by bridging the domain gap between text-to-2D and text-to-3D generation. Y Ma, Y Fan, J Ji, H Wang, X Sun, G Jiang, A Shu, R Ji. ACM Transactions on Multimedia Computing, Communications and Applications (ToMM), 2023	7	2023
Semi-supervised panoptic narrative grounding. D Yang, J Ji, X Sun, H Wang, Y Li, Y Ma, R Ji. Proceedings of the 31st ACM International Conference on Multimedia, 7164-7174, 2023	7	2023
Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval. Y Ma, X Sun, J Ji, G Jiang, W Zhuang, R Ji. Proceedings of the 31st ACM International Conference on Multimedia (ACM MM …, 2023	7	2023
3D-STMN: Dependency-driven superpoint-text matching network for end-to-end 3D referring expression segmentation. C Wu, Y Ma, Q Chen, H Wang, G Luo, J Ji, X Sun. Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5940-5948, 2024	6	2024
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation. D Yang, J Ji, Y Ma, T Guo, H Wang, X Sun, R Ji. arXiv preprint arXiv:2406.01451, 2024	3	2024
Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation. T Guo, H Wang, Y Ma, J Ji, X Sun. Proceedings of the AAAI Conference on Artificial Intelligence 38 (3), 1985-1993, 2024	3	2024
3D-GRES: Generalized 3D Referring Expression Segmentation. C Wu, Y Liu, J Ji, Y Ma, H Wang, G Luo, H Ding, X Sun, R Ji. ACM International Conference on Multimedia (ACM MM), 2024	1	2024
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation. Y Ma, Z Lin, J Ji, Y Fan, X Sun, R Ji. International Conference on Machine Learning (ICML), 2024	1	2024
JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues. J Ji, H Wang, C Wu, Y Ma, X Sun, R Ji. arXiv preprint arXiv:2310.09503, 2023	1	2023
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing. Y Ma, J Ji, K Ye, W Lin, Z Wang, Y Zheng, Q Zhou, X Sun, R Ji. arXiv preprint arXiv:2408.14180, 2024		2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model. Y Ma, Z Wang, X Sun, W Lin, Q Zhou, J Ji, R Ji. arXiv preprint arXiv:2407.16198, 2024		2024
Multi-branch Collaborative Learning Network for 3D Visual Grounding. Z Qian, Y Ma, Z Lin, J Ji, X Zheng, X Sun, R Ji. European Conference on Computer Vision (ECCV), 2024		2024
