Data Science Lab
Image Captioners Are Scalable Vision Learners Too
Feb 18, 2025
#Multimodal
#Vision-Language Pretraining