Armaghan Shakir

geetu040

AI & ML interests

Vision, Language and Vision-Language Models

Recent Activity

liked a model about 1 month ago

neuphonic/neutts-air

liked a model about 2 months ago

ibm-granite/granite-docling-258M

liked a model 3 months ago

microsoft/VibeVoice-1.5B

upvoted a collection 3 months ago

👁️ LFM2-VL

LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 9 items • Updated 14 days ago • 50

upvoted 2 articles 3 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

Aug 5 • 505

Article

Introducing Command A Vision: Multimodal AI built for Business

By … and 3 others • Jul 31 • 63

upvoted an article 4 months ago

Article

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

Aug 17, 2022 • 114

upvoted 2 collections 4 months ago

🐕Small-Doges

Doge family of small language models! • 18 items • Updated Apr 21 • 11

💧 LFM2

LFM2 is a new generation of hybrid models, designed for on-device deployment. • 22 items • Updated 7 days ago • 118

upvoted an article 4 months ago

Article

cocogold: training Marigold for text-grounded segmentation

Jul 8 • 31

upvoted a paper 4 months ago

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks

Paper • 2507.01955 • Published Jul 2 • 35

upvoted a collection 5 months ago

Gemma 3n

4 items • Updated Jul 10 • 237

upvoted an article 5 months ago

Article

Introducing smolagents: simple agents that write actions in code.

Dec 31, 2024 • 1.14k

upvoted 2 papers 5 months ago

SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation

Paper • 2506.18349 • Published Jun 23 • 13

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 270

upvoted 2 collections 5 months ago

MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 22 days ago • 115

V-JEPA 2

A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 169

upvoted an article 6 months ago

Article

Fine-Tuning 1B LLaMA 3.2: A Comprehensive Step-by-Step Guide with Code

Oct 2, 2024 • 73

upvoted 2 papers 6 months ago

MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

Paper • 2505.07916 • Published May 12 • 132

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11 • 31

upvoted an article 7 months ago

Article

An Introduction to AI Model Optimization Techniques

By … and 1 other • Apr 18 • 29

upvoted a collection 7 months ago

Llama 4

Llama 4 release • 13 items • Updated Apr 29 • 656

upvoted a paper 8 months ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 171