VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images Paper • 2604.09531 • Published about 1 month ago • 8
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published Apr 8 • 41