rivasmig's Collections

VLMs

updated Apr 19, 2025
Upvote
1

  • Task Vectors are Cross-Modal

    Paper • 2410.22330 • Published Oct 29, 2024 • 11

  • AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

    Paper • 2502.01341 • Published Feb 3, 2025 • 39

  • DASH: Detection and Assessment of Systematic Hallucinations of VLMs

    Paper • 2503.23573 • Published Mar 30, 2025 • 12

  • Kimi-VL Technical Report

    Paper • 2504.07491 • Published Apr 10, 2025 • 142

  • SmolVLM: Redefining small and efficient multimodal models

    Paper • 2504.05299 • Published Apr 7, 2025 • 208

  • Video-R1: Reinforcing Video Reasoning in MLLMs

    Paper • 2503.21776 • Published Mar 27, 2025 • 79