- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 28
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 14
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23

Collections
Collections including paper arxiv:2509.20354
- Open Data Synthesis For Deep Research
  Paper • 2509.00375 • Published • 68
- Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
  Paper • 2509.03403 • Published • 21
- LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
  Paper • 2509.03405 • Published • 23
- SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
  Paper • 2509.00930 • Published • 4

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 298
- Scalable-Softmax Is Superior for Attention
  Paper • 2501.19399 • Published • 22
- FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation
  Paper • 2502.01068 • Published • 18
- Scaling Embedding Layers in Language Models
  Paper • 2502.01637 • Published • 24

- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
  Paper • 2508.09789 • Published • 5
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
  Paper • 2508.13186 • Published • 18
- ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
  Paper • 2508.04038 • Published • 1
- Prompt Orchestration Markup Language
  Paper • 2508.13948 • Published • 48

- EmbeddingGemma: Powerful and Lightweight Text Representations
  Paper • 2509.20354 • Published • 39
- CoDiEmb: A Collaborative yet Distinct Framework for Unified Representation Learning in Information Retrieval and Semantic Textual Similarity
  Paper • 2508.11442 • Published • 3
- Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
  Paper • 2506.05176 • Published • 74
- BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
  Paper • 2402.03216 • Published • 6

- Tesslate/UIGEN-X-8B
  Text Generation • 8B • Updated • 32 • • 58
- Intelligent-Internet/II-Search-4B
  Text Generation • 4B • Updated • 86 • 100
- MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh
  Paper • 2508.01242 • Published • 10
- SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens
  Paper • 2508.05305 • Published • 46