Data Science Lab

    2025

  • Jun 02, 2025 Harnessing the Universal Geometry of Embeddings #Text Embedding
  • May 26, 2025 RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement #LLM #RAG
  • May 19, 2025 World models #World model
  • May 12, 2025 K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor #LLM #RAG
  • May 07, 2025 Diffusion Feedback Helps CLIP See Better #MLLM #Diffusion
  • Apr 28, 2025 Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs #Retriever
  • Apr 23, 2025 Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs #MLLM
  • Apr 14, 2025 Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation #Diffusion
  • Apr 07, 2025 RouteLLM: Learning to Route LLMs with Preference Data #LLM Routing
  • Mar 31, 2025 ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback #Multi-Modal #Reinforcement Learning
  • Mar 24, 2025 MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs #Multi-Modal #MLLM
  • Mar 17, 2025 The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation #LLM
  • Mar 10, 2025 Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models #3D Motion #Diffusion
  • Mar 04, 2025 Mixture-of-Agents Enhances Large Language Model Capabilities #Agent #LLM
  • Feb 25, 2025 Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation #LLM #RAG
  • Feb 25, 2025 Chain-of-Thought Reasoning Without Prompting #Decoding
  • Feb 18, 2025 ReAct: Synergizing Reasoning and Acting in Language Models #Agent #NLP
  • Feb 18, 2025 Image Captioners Are Scalable Vision Learners Too #Multimodal #Vision-Language Pretraining
  • Jan 07, 2025 Multimodal Procedural Planning via Dual Text-Image Prompting #Multimodal
    2024

  • Dec 31, 2024 Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach #RAG #LLM
  • Dec 24, 2024 VisualWebArena: Evaluating Multimodal Agents on Realistic Visually Grounded Web Tasks #Agent
  • Dec 17, 2024 Guiding a Diffusion Model with a Bad Version of Itself #Image generation
  • Dec 10, 2024 The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation #Persuasive Misinformation #LLM
  • Dec 02, 2024 UniIR: Training and Benchmarking Universal Multimodal Information Retrievers #Retrieval
  • Nov 19, 2024 Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration #Hallucination #LLM
  • Nov 07, 2024 Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models #VLM #Representation Learning
  • Nov 05, 2024 Proving Test Set Contamination in Black-Box Language Models #Test Set Contamination
  • Oct 29, 2024 DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models #RAG #LLM
  • Oct 22, 2024 Iterated Learning Improves Compositionality in Large Vision-Language Models #VLM
  • Oct 16, 2024 What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning #Data Selection #LLM #Instruction Tuning
  • Oct 02, 2024 SELF-REFINE: Iterative Refinement with Self-Feedback #Hallucination #LLM
  • Sep 25, 2024 API-Assisted Code Generation for Question Answering on Varied Table Structures #Tabular data
  • Sep 11, 2024 LLaVA-OneVision: Easy Visual Task Transfer #Multimodal
  • Sep 04, 2024 Improving Text Embeddings with Large Language Models #Text Embedding
  • Aug 27, 2024 Faster Minimum Bayes Risk Decoding with Confidence-based Pruning #MBR
  • Aug 21, 2024 Images Speak in Images: A Generalist Painter for In-Context Visual Learning #Image generation #ViT
  • Aug 20, 2024 Merging Generated and Retrieved Knowledge for Open-Domain QA #RAG #LLM
  • Aug 13, 2024 Lost in the Middle: How Language Models Use Long Contexts #Long-Context #Question Answering #Information Retrieval
  • Aug 06, 2024 G-EVAL: NLG Evaluation using GPT-4 with Better Human Alignment #Evaluation
  • Jul 31, 2024 Null-text Inversion for Editing Real Images using Guided Diffusion Models #Image Editing
  • Jul 30, 2024 E5: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate #Tabular data
  • Jul 23, 2024 Text Embeddings Reveal (Almost) As Much As Text #Embedding Inversion #Text Embedding
  • Jul 17, 2024 ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts #Multimodal
  • Jul 03, 2024 Segment Anything #Image-Segmentation
  • Jul 02, 2024 Are Emergent Abilities of Large Language Models a Mirage? #Emergent Abilities
  • Jun 25, 2024 Learning to Retrieve In-Context Examples for Large Language Models #RAG #LLM
  • Jun 19, 2024 Exploring Simple Siamese Representation Learning #Representation Learning
  • Jun 16, 2024 LIMA: Less Is More for Alignment #LLM #Instruction Tuning #Chat Assistant
  • Jun 05, 2024 Visual Instruction Tuning #Multimodal
  • May 22, 2024 Time to Shine: Fine-Tuning Object Detection Models with Synthetic Adverse Weather Images #Object-Detection #Image generation
  • May 21, 2024 The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models #Hallucination #LLM
  • May 14, 2024 CABINET: Content Relevance based Noise Reduction for Table Question Answering #Tabular data
  • May 08, 2024 Best of Both Worlds: Learning Arbitrary-scale Blind Super-Resolution via Dual Degradation Representations and Cycle-Consistency #Super-Resolution
  • May 01, 2024 Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering #VQA
  • Apr 30, 2024 Look-back Decoding for Open-Ended Text Generation #Decoding
  • Apr 23, 2024 SLIDE: Reference-free Evaluation for Machine Translation using a Sliding Document Window #Evaluation
  • Apr 17, 2024 SSSD: Self-Supervised Self Distillation #Distillation
  • Apr 16, 2024 Self-Knowledge Guided Retrieval Augmentation for Large Language Models #RAG #LLM
  • Apr 09, 2024 RECOMP: Improving Retrieval-augmented LMs with Compression and Selective Augmentation #RAG #Question Answering #Summarization
  • Apr 02, 2024 Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy #RAG
  • Mar 27, 2024 Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding #Image generation
  • Mar 26, 2024 Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection #RAG
  • Mar 13, 2024 DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting #Object-Detection
  • Mar 12, 2024 In-Context Retrieval-Augmented Language Models #RAG #Information Retrieval
  • Mar 06, 2024 Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models #QA
  • Feb 27, 2024 Understanding Dark Scenes by Contrasting Multi-Modal Observations #Image-Segmentation
  • Feb 27, 2024 Active Retrieval Augmented Generation #RAG #LLM
  • Feb 27, 2024 BEIT: BERT Pre-Training of Image Transformers #BERT #ViT
  • Feb 16, 2024 Generate Rather than Retrieve: Large Language Models are Strong Context Generators #Prompt Learning #LLM #Question Answering
  • Feb 13, 2024 ARNIQA: Learning Distortion Manifold for Image Quality Assessment #IQA
  • Feb 06, 2024 CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing #Hallucination #LLM
  • Jan 30, 2024 Reasoning with Language Model is Planning with World Model #planning #LLM
  • Jan 16, 2024 Extrinsic Evaluation of Machine Translation Metrics #Evaluation
  • Jan 02, 2024 SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model #Image Inpainting
    2023

  • Dec 27, 2023 Shape-biased CNNs are Not Always Superior in Out-of-Distribution Robustness #Representation
  • Dec 27, 2023 HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models #Hallucination #LLM
  • Dec 13, 2023 Oriented R-CNN for Object Detection #RCNN #CV
  • Dec 13, 2023 Scaling Instruction-Finetuned Language Models #Instruction Tuning #LLM
  • Nov 22, 2023 Atlas: Few-shot Learning with Retrieval Augmented Language Models #RAG #Information Retrieval
  • Nov 15, 2023 Evaluation Metrics for Text Generation (BERTScore, BARTScore) #Evaluation
  • Nov 15, 2023 Robust Speech Recognition via Large-Scale Weak Supervision #Representation
  • Nov 08, 2023 Rethinking Fast Fourier Convolution in Image Inpainting #Image Inpainting
  • Nov 08, 2023 Deep Preset: Blending and Retouching Photos with Color Style Transfer #Style Transfer #CV
  • Oct 25, 2023 Soft Augmentation for Image Classification #Image Classification #CV
  • Oct 25, 2023 LLM BLENDER: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion #Ensemble
  • Oct 18, 2023 Predicting Numerals in Text Using Nearest Neighbor Language Models #Numerical Data
  • Oct 11, 2023 Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker #Reasoning #LLM
  • Oct 11, 2023 ImageNet Pre-training also Transfers Non-robustness #Representation
  • Sep 27, 2023 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis #3D Rendering
  • Sep 20, 2023 Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks #RAG #Information Retrieval
  • Sep 13, 2023 Causes and Cures for Interference in Multilingual Translation #Machine Translation
  • Sep 13, 2023 Neural Preset for Color Style Transfer #Style Transfer #CV
  • Sep 06, 2023 Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest #LLM
  • Aug 28, 2023 Mask-guided Matting in the Wild #Matting
  • Aug 21, 2023 Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture #Self-supervised
  • Aug 21, 2023 Generative Agents: Interactive Simulacra of Human Behavior #LLM #Agent
  • Aug 14, 2023 Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems #Dialogue State Tracking #Dialogue System
  • Aug 14, 2023 Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling #CNN #CV
  • Aug 07, 2023 P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks #PEFT
  • Jul 31, 2023 Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring #Deblurring
  • Jul 31, 2023 Deep Frequency Filtering for Domain Generalization #Domain Generalization
  • Jul 24, 2023 RLPROMPT: Optimizing Discrete Text Prompts with Reinforcement Learning #Reinforcement Learning
  • Jul 23, 2023 Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations #Representation
  • Jul 17, 2023 Detecting and Mitigating Hallucinations in Machine Translation #Machine Translation
  • Jul 10, 2023 Self-Instruct: Aligning Language Models with Self-Generated Instructions #Instruction Tuning #LLM
  • Jul 03, 2023 Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? #In-Context Learning
  • Jun 26, 2023 Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models #Text Style Transfer
  • Jun 26, 2023 Towards Open World Object Detection #Object-Detection
  • Jun 09, 2023 CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying #Image Inpainting
  • Jun 02, 2023 Proposal-Contrastive Pretraining for Object Detection from Fewer Data #Representation
  • May 31, 2023 LoRA: Low-Rank Adaptation of Large Language Models #LoRA #PEFT
  • May 12, 2023 Training Independent Subnetworks for Robust Prediction #MIMO #Uncertainty
  • May 10, 2023 A Recipe For Arbitrary Text Style Transfer with Large Language Models #Text Style Transfer #LLM
  • Apr 21, 2023 SiamMOT: Siamese Multi-Object Tracking #Tracking
  • Apr 19, 2023 Self-Consistency Improves Chain of Thought Reasoning in Language Models #CoT #self-consistency
  • Apr 12, 2023 What is being transferred in transfer learning? #Representation
  • Apr 12, 2023 FreeLB: Enhanced Adversarial Training for Natural Language Understanding #Adversarial Training
  • Mar 23, 2023 Mining Cross-Person Cues for Body-Part Interactiveness Learning #HOI
  • Mar 22, 2023 Chain-of-Thought Prompting Elicits Reasoning in Large Language Models #LLM #CoT
  • Mar 16, 2023 Visualizing the Loss Landscape of Neural Nets #Representation
  • Mar 15, 2023 Efficient Domain Adaptation of Language Models via Adaptive Tokenization #Adversarial Training
  • Feb 15, 2023 Fast Fourier Convolution #CNN
  • Feb 13, 2023 Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity #Few-shot Learning #In-context Learning
  • Jan 18, 2023 CVF-SID: Cyclic Multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise From Image #Denoising
  • Jan 11, 2023 Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study #Representation
  • Jan 11, 2023 MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer #Classification
  • Jan 11, 2023 End-to-End Object Detection with Transformers #Object-Detection
  • Jan 09, 2023 Self-training Improves Pre-training for Natural Language Understanding #Self-training #Pre-training #NLU
    2022

  • Dec 12, 2022 Training Generative Adversarial Networks in One Stage #GAN
  • Nov 28, 2022 Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little #Language Modeling
  • Nov 14, 2022 Dice Loss for Data-imbalanced NLP Tasks #Loss #NLP
  • Nov 07, 2022 How Do Vision Transformers Work? #Representation
  • Nov 07, 2022 Denoising Diffusion Probabilistic Models #Diffusion
  • Nov 07, 2022 PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection #Object-Detection
  • Oct 31, 2022 Making Pre-trained Language Models Better Few-shot Learners #Few-shot learning
  • Oct 14, 2022 How good is your tokenizer? On the monolingual performance of multilingual language models #tokenizer #NLP
  • Sep 26, 2022 Introduction to Image Inpainting #Image inpainting
  • Sep 19, 2022 LinkBERT: Improving Language Model Training with Document Link #Language Modeling
  • Aug 08, 2022 Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions #Classification
  • Aug 01, 2022 DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings #Contrastive Learning #Representation
  • Jun 27, 2022 GPT-3: Language Models are Few-Shot Learners #Language Modeling
  • May 30, 2022 CornerNet: Detecting Objects as Paired Keypoints #Object-Detection
  • May 25, 2022 Language Models are Unsupervised Multitask Learners (GPT2) #Language Modeling
  • May 25, 2022 BioBERT: a pre-trained biomedical language representation model for biomedical text mining #Biomedical
  • May 02, 2022 You Only Look Once: Unified, Real-Time Object Detection #Object-Detection
  • Mar 28, 2022 Deep High-Resolution Representation Learning for Visual Recognition #Object-Detection
  • Mar 23, 2022 ALBERT: A Lite BERT for Self-supervised Learning of Language Representations #Self-supervised
  • Mar 10, 2022 Attention is All you Need #Attention
  • All 148
  • 2025 19
  • 2024 55
  • 2023 54
  • 2022 20

Room 408-1, Engineering Building 4, 55, Hanyangdaehak-ro (Sa-dong), Sangnok-gu, Ansan-si, Gyeonggi-do 15588