Shanghang Zhang
Verified email at eecs.berkeley.edu
Title
Cited by
Year
Multimodal Large Language Models for Bioimage Analysis
S Zhang, G Dai, T Huang, J Chen
arXiv preprint arXiv:2407.19778, 2024
2024
MAVIS: Mathematical Visual Instruction Tuning
R Zhang, X Wei, D Jiang, Y Zhang, Z Guo, C Tong, J Liu, A Zhou, B Wei, ...
arXiv preprint arXiv:2407.08739, 2024
2024
Fisher-aware Quantization for DETR Detectors with Critical-category Objectives
H Yang, Y Huang, Z Dong, DA Gudovskiy, T Okuno, Y Nakata, Y Du, ...
arXiv preprint arXiv:2407.03442, 2024
2024
MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception
G Wang, X Wei, J Liu, R Zhang, Y Zhang, K Zhang, M Chong, S Zhang
arXiv preprint arXiv:2406.15768, 2024
2024
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation
J Liu, M Liu, Z Wang, L Lee, K Zhou, P An, S Yang, R Zhang, Y Guo, ...
arXiv preprint arXiv:2406.04339, 2024
Cited by 2
2024
S³Gaussian: Self-Supervised Street Gaussians for Autonomous Driving
N Huang, X Wei, W Zheng, P An, M Lu, W Zhan, M Tomizuka, K Keutzer, ...
arXiv preprint arXiv:2405.20323, 2024
2024
Implicit Neural Image Field for Biological Microscopy Image Compression
G Dai, CC Tseng, Q Wuwu, R Zhang, S Wang, M Lu, T Huang, Y Zhou, ...
arXiv preprint arXiv:2405.19012, 2024
2024
Compositional Few-Shot Class-Incremental Learning
Y Zou, S Zhang, H Zhou, Y Li, R Li
arXiv preprint arXiv:2405.17022, 2024
Cited by 1
2024
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
X Qi, H Zhang, Y Wang, J Pan, C Liu, P Li, X Chi, M Li, Q Zhang, W Xue, ...
arXiv preprint arXiv:2405.16874, 2024
2024
Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation
J Liu, C Li, G Wang, L Lee, K Zhou, S Chen, C Xiong, J Ge, R Zhang, ...
arXiv preprint arXiv:2405.17418, 2024
Cited by 2
2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models
Y Zhang, F Xiao, T Huang, CK Fan, H Dong, J Li, J Wang, K Cheng, ...
arXiv preprint arXiv:2405.14156, 2024
2024
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
P Li, Y Liu, X Long, F Zhang, C Lin, M Li, X Qi, S Zhang, W Luo, P Tan, ...
arXiv preprint arXiv:2405.11616, 2024
2024
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model
Y Luo, R An, B Zou, Y Tang, J Liu, S Zhang
arXiv preprint arXiv:2405.02363, 2024
Cited by 1
2024
Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning
Y Liu, R Zhang, H Yang, K Keutzer, Y Du, L Du, S Zhang
arXiv preprint arXiv:2404.08985, 2024
Cited by 3
2024
A multimodal physiological dataset for driving behaviour analysis
X Tao, D Gao, W Zhang, T Liu, B Du, S Zhang, Y Qin
Nature Scientific data 11 (1), 378, 2024
Cited by 1
2024
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
Y Tang, J Liu, D Wang, Z Wang, S Zhang, B Zhao, X Li
arXiv preprint arXiv:2404.07989, 2024
Cited by 1
2024
SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera
G Dai, Z Wang, Q Xu, W Cheng, M Lu, B Shi, S Zhang, T Huang
arXiv preprint arXiv:2404.06710, 2024
2024
Exploring generalizable distillation for efficient medical image segmentation
X Qi, Z Wu, W Zou, M Ren, Y Gao, M Sun, S Zhang, C Shan, Z Sun
IEEE Journal of Biomedical and Health Informatics, 2024
Cited by 1
2024
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
W Lin, X Wei, R An, P Gao, B Zou, Y Luo, S Huang, S Zhang, H Li
arXiv preprint arXiv:2403.20271, 2024
Cited by 2
2024
Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection
H Gao, Z Chen, Z Chen, L Chen, J Liu, S Zhang, F Zhao
Proceedings of the AAAI Conference on Artificial Intelligence 38 (3), 1797-1805, 2024
2024