Seminar | Data Science Lab

2025

Aug 26, 2025 Do As I Can, Not As I Say: Grounding Language in Robotic Affordances #LLM #Reinforcement Learning
Aug 19, 2025 Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse #LLM evaluation #RAG
Aug 12, 2025 Learning to Reason from Feedback at Test-Time #Test-time training
Aug 05, 2025 SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback #RL
Jul 29, 2025 TriSampler: A Better Negative Sampling Principle for Dense Retrieval #Retriever #Negative Sampling
Jul 22, 2025 Transformers without Normalization #Computer Vision #LLM
Jul 15, 2025 Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning #RL
Jul 08, 2025 Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens #VLM
Jun 30, 2025 DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding #Multimodal #VLM
Jun 23, 2025 MixLLM: Dynamic Routing in Mixed Large Language Models #LLM Routing #LLM
Jun 16, 2025 Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents #Benchmark #Agent
Jun 09, 2025 SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains #RAG #Self-Training
Jun 02, 2025 Harnessing the Universal Geometry of Embeddings #Text Embedding
May 26, 2025 RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement #LLM #RAG
May 19, 2025 World models #World model
May 12, 2025 K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor #LLM #RAG
May 07, 2025 Diffusion Feedback Helps CLIP See Better #MLLM #Diffusion
Apr 28, 2025 Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs #Retriever
Apr 23, 2025 Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs #MLLM
Apr 14, 2025 Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation #Diffusion
Apr 07, 2025 RouteLLM: Learning to Route LLMs with Preference Data #LLM Routing
Mar 31, 2025 ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback #Multi-Modal #Reinforcement Learning
Mar 24, 2025 MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs #Multi-Modal #MLLM
Mar 17, 2025 The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation #LLM
Mar 10, 2025 Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models #3D Motion #Diffusion
Mar 04, 2025 Mixture-of-Agents Enhances Large Language Model Capabilities #Agent #LLM
Feb 25, 2025 Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation #LLM #RAG
Feb 25, 2025 Chain-of-Thought Reasoning Without Prompting #Decoding
Feb 18, 2025 ReAct: Synergizing Reasoning and Acting in Language Models #Agent #NLP
Feb 18, 2025 Image Captioners Are Scalable Vision Learners Too #Multimodal #Vision-Language Pretraining
Jan 07, 2025 Multimodal Procedural Planning via Dual Text-Image Prompting #Multimodal

2024

Dec 31, 2024 Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach #RAG #LLM
Dec 24, 2024 VisualWebArena: Evaluating Multimodal Agents on Realistic Visually Grounded Web Tasks #Agent
Dec 17, 2024 Guiding a Diffusion Model with a Bad Version of Itself #Image generation
Dec 10, 2024 The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation #Persuasive Misinformation #LLM
Dec 02, 2024 UniIR: Training and Benchmarking Universal Multimodal Information Retrievers #Retrieval
Nov 19, 2024 Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration #Hallucination #LLM
Nov 07, 2024 Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models #VLM #Representation Learning
Nov 05, 2024 Proving Test Set Contamination in Black-Box Language Models #Test Set Contamination
Oct 29, 2024 DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models #RAG #LLM
Oct 22, 2024 Iterated Learning Improves Compositionality in Large Vision-Language Models #VLM
Oct 16, 2024 What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning #Data Selection #LLM #Instruction Tuning
Oct 02, 2024 SELF-REFINE: Iterative Refinement with Self-Feedback #Hallucination #LLM
Sep 25, 2024 API-Assisted Code Generation for Question Answering on Varied Table Structures #Tabular data
Sep 11, 2024 LLaVA-OneVision: Easy Visual Task Transfer #Multimodal
Sep 04, 2024 Improving Text Embeddings with Large Language Models #Text Embedding
Aug 27, 2024 Faster Minimum Bayes Risk Decoding with Confidence-based Pruning #MBR
Aug 21, 2024 Images Speak in Images: A Generalist Painter for In-Context Visual Learning #Image generation #ViT
Aug 20, 2024 Merging Generated and Retrieved Knowledge for Open-Domain QA #RAG #LLM
Aug 13, 2024 Lost in the Middle: How Language Models Use Long Contexts #Long-Context #Question Answering #Information Retrieval
Aug 06, 2024 G-EVAL: NLG Evaluation using GPT-4 with Better Human Alignment #Evaluation
Jul 31, 2024 Null-text Inversion for Editing Real Images using Guided Diffusion Models #Image Editing
Jul 30, 2024 E5: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate #Tabular data
Jul 23, 2024 Text Embeddings Reveal (Almost) As Much As Text #Embedding Inversion #Text Embedding
Jul 17, 2024 ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts #Multimodal
Jul 03, 2024 Segment Anything #Image-Segmentation
Jul 02, 2024 Are Emergent Abilities of Large Language Models a Mirage? #Emergent Abilities
Jun 25, 2024 Learning to Retrieve In-Context Examples for Large Language Models #RAG #LLM
Jun 19, 2024 Exploring Simple Siamese Representation Learning #Representation Learning
Jun 16, 2024 LIMA: Less Is More for Alignment #LLM #Instruction Tuning #Chat Assistant
Jun 05, 2024 Visual Instruction Tuning #Multimodal
May 22, 2024 Time to Shine: Fine-Tuning Object Detection Models with Synthetic Adverse Weather Images #Object-Detection #Image generation
May 21, 2024 The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models #Hallucination #LLM
May 14, 2024 CABINET: Content Relevance based Noise Reduction for Table Question Answering #Tabular data
May 08, 2024 Best of Both Worlds: Learning Arbitrary-scale Blind Super-Resolution via Dual Degradation Representations and Cycle-Consistency #Super-Resolution
May 01, 2024 Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering #VQA
Apr 30, 2024 Look-back Decoding for Open-Ended Text Generation #Decoding
Apr 23, 2024 SLIDE: Reference-free Evaluation for Machine Translation using a Sliding Document Window #Evaluation
Apr 17, 2024 SSSD: Self-Supervised Self Distillation #Distillation
Apr 16, 2024 Self-Knowledge Guided Retrieval Augmentation for Large Language Models #RAG #LLM
Apr 09, 2024 RECOMP: Improving Retrieval-augmented LMs with Compression and Selective Augmentation #RAG #Question Answering #Summarization
Apr 02, 2024 Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy #RAG
Mar 27, 2024 Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding #Image generation
Mar 26, 2024 Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection #RAG
Mar 13, 2024 DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting #Object-Detection
Mar 12, 2024 In-Context Retrieval-Augmented Language Models #RAG #Information Retrieval
Mar 06, 2024 Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models #QA
Feb 27, 2024 Understanding Dark Scenes by Contrasting Multi-Modal Observations #Image-Segmentation
Feb 27, 2024 Active Retrieval Augmented Generation #RAG #LLM
Feb 27, 2024 BEIT: BERT Pre-Training of Image Transformers #BERT #ViT
Feb 16, 2024 Generate Rather than Retrieve: Large Language Models are Strong Context Generators #Prompt Learning #LLM #Question Answering
Feb 13, 2024 ARNIQA: Learning Distortion Manifold for Image Quality Assessment #IQA
Feb 06, 2024 CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing #Hallucination #LLM
Jan 30, 2024 Reasoning with Language Model is Planning with World Model #planning #LLM
Jan 16, 2024 Extrinsic Evaluation of Machine Translation Metrics #Evaluation
Jan 02, 2024 SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model #Image Inpainting

2023

Dec 27, 2023 Shape-biased CNNs are Not Always Superior in Out-of-Distribution Robustness #Representation
Dec 27, 2023 HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models #Hallucination #LLM
Dec 13, 2023 Oriented R-CNN for Object Detection #RCNN #CV
Dec 13, 2023 Scaling Instruction-Finetuned Language Models #Instruction Tuning #LLM
Nov 22, 2023 Atlas: Few-shot Learning with Retrieval Augmented Language Models #RAG #Information Retrieval
Nov 15, 2023 Evaluation Metrics for Text Generation (BERTScore, BARTScore) #Evaluation
Nov 15, 2023 Robust Speech Recognition via Large-Scale Weak Supervision #Representation
Nov 08, 2023 Rethinking Fast Fourier Convolution in Image Inpainting #Image Inpainting
Nov 08, 2023 Deep Preset: Blending and Retouching Photos with Color Style Transfer #Style Transfer #CV
Oct 25, 2023 Soft Augmentation for Image Classification #Image Classification #CV
Oct 25, 2023 LLM BLENDER: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion #Ensemble
Oct 18, 2023 Predicting Numerals in Text Using Nearest Neighbor Language Models #Numerical Data
Oct 11, 2023 Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker #Reasoning #LLM
Oct 11, 2023 ImageNet Pre-training also Transfers Non-robustness #Representation
Sep 27, 2023 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis #3D Rendering
Sep 20, 2023 Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks #RAG #Information Retrieval
Sep 13, 2023 Causes and Cures for Interference in Multilingual Translation #Machine Translation
Sep 13, 2023 Neural Preset for Color Style Transfer #Style Transfer #CV
Sep 06, 2023 Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest #LLM
Aug 28, 2023 Mask-guided Matting in the Wild #Matting
Aug 21, 2023 Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture #Self-supervised
Aug 21, 2023 Generative Agents: Interactive Simulacra of Human Behavior #LLM #Agent
Aug 14, 2023 Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems #Dialogue State Tracking #Dialogue System
Aug 14, 2023 Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling #CNN #CV
Aug 07, 2023 P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks #PEFT
Jul 31, 2023 Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring #Deblurring
Jul 31, 2023 Deep Frequency Filtering for Domain Generalization #Domain Generalization
Jul 24, 2023 RLPROMPT: Optimizing Discrete Text Prompts with Reinforcement Learning #Reinforcement Learning
Jul 23, 2023 Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations #Representation
Jul 17, 2023 Detecting and Mitigating Hallucinations in Machine Translation #Machine Translation
Jul 10, 2023 Self-Instruct: Aligning Language Models with Self-Generated Instructions #Instruction Tuning #LLM
Jul 03, 2023 Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? #In-Context Learning
Jun 26, 2023 Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models #Text Style Transfer
Jun 26, 2023 Towards Open World Object Detection #Object-Detection
Jun 09, 2023 CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying #Image Inpainting
Jun 02, 2023 Proposal-Contrastive Pretraining for Object Detection from Fewer Data #Representation
May 31, 2023 LoRA: Low-Rank Adaptation of Large Language Models #LoRA #PEFT
May 12, 2023 Training Independent Subnetworks for Robust Prediction #MIMO #Uncertainty
May 10, 2023 A Recipe For Arbitrary Text Style Transfer with Large Language Models #Text Style Transfer #LLM
Apr 21, 2023 SiamMOT: Siamese Multi-Object Tracking #Tracking
Apr 19, 2023 Self-Consistency Improves Chain of Thought Reasoning in Language Models #CoT #self-consistency
Apr 12, 2023 What is being transferred in transfer learning? #Representation
Apr 12, 2023 FreeLB: Enhanced Adversarial Training for Natural Language Understanding #Adversarial Training
Mar 23, 2023 Mining Cross-Person Cues for Body-Part Interactiveness Learning #HOI
Mar 22, 2023 Chain-of-Thought Prompting Elicits Reasoning in Large Language Models #LLM #CoT
Mar 16, 2023 Visualizing the Loss Landscape of Neural Nets #Representation
Mar 15, 2023 Efficient Domain Adaptation of Language Models via Adaptive Tokenization #Adversarial Training
Feb 15, 2023 Fast Fourier Convolution #CNN
Feb 13, 2023 Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity #Few-shot Learning #In-context Learning
Jan 18, 2023 CVF-SID: Cyclic Multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise From Image #Denoising
Jan 11, 2023 Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study #Representation
Jan 11, 2023 MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer #Classification
Jan 11, 2023 End-to-End Object Detection with Transformers #Object-Detection
Jan 09, 2023 Self-training Improves Pre-training for Natural Language Understanding #Self-training #Pre-training #NLU

2022

Dec 12, 2022 Training Generative Adversarial Networks in One Stage #GAN
Nov 28, 2022 Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little #Language Modeling
Nov 14, 2022 Dice Loss for Data-imbalanced NLP Tasks #Loss #NLP
Nov 07, 2022 How Do Vision Transformers Work? #Representation
Nov 07, 2022 Denoising Diffusion Probabilistic Model #Diffusion
Nov 07, 2022 PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection #Object-Detection
Oct 31, 2022 Making Pre-trained Language Models Better Few-shot Learners #Few-shot learning
Oct 14, 2022 How good is your tokenizer? On the monolingual performance of multilingual language models #tokenizer #NLP
Sep 26, 2022 Introduction to Image Inpainting #Image inpainting
Sep 19, 2022 LinkBERT: Improving Language Model Training with Document Link #Language Modeling
Aug 08, 2022 Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions #Classification
Aug 01, 2022 DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings #Contrastive Learning #Representation
Jun 27, 2022 GPT-3: Language Models are Few-Shot Learners #Language Modeling
May 30, 2022 CornerNet: Detecting Objects as Paired Keypoints #Object-Detection
May 25, 2022 Language Models are Unsupervised Multitask Learners (GPT-2) #Language Modeling
May 25, 2022 BioBERT: a pre-trained biomedical language representation model for biomedical text mining #Biomedical
May 02, 2022 You Only Look Once: Unified, Real-Time Object Detection #Object-Detection
Mar 28, 2022 Deep High-Resolution Representation Learning for Visual Recognition #Object-Detection
Mar 23, 2022 ALBERT: A Lite BERT for Self-supervised Learning of Language Representations #Self-supervised
Mar 10, 2022 Attention is All you Need #Attention