- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 51
- Qwen Technical Report
  Paper • 2309.16609 • Published • 37
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 7
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 47
Collections
Collections including paper arxiv:2312.17661
- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 43
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 51
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1
- Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
  Paper • 2310.00653 • Published • 3
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training
  Paper • 2401.00849 • Published • 17
- Learning Vision from Models Rivals Learning Vision from Data
  Paper • 2312.17742 • Published • 16
- Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models
  Paper • 2312.17661 • Published • 15
- A Vision Check-up for Language Models
  Paper • 2401.01862 • Published • 11
- Attention Is All You Need
  Paper • 1706.03762 • Published • 108
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 9
- Mixtral of Experts
  Paper • 2401.04088 • Published • 160
- Mistral 7B
  Paper • 2310.06825 • Published • 56