286 articles in total
2025
VISUAL AGENTS AS FAST AND SLOW THINKERS
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Five Types of LoRA
Soft Instruction De-Escalation Defense
Jigsaw-Agile Community Rules Classification: First-Place Solution
DoRA: Weight-Decomposed Low-Rank Adaptation
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
VASparse: Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification