Data Science Lab

Image Captioners Are Scalable Vision Learners Too

Feb 18, 2025
#Multimodal #Vision-Language Pretraining

Room 408-1, Engineering Building 4, 55, Hanyangdaehak-ro (Sa-dong), Sangnok-gu, Ansan-si, Gyeonggi-do 15588