Kirill Gelvan

Kirili4ik

https://github.com/Kirili4ik

AI & ML interests

NLP, DL for Audio, Generative Models

Recent Activity

upvoted a collection about 2 months ago

SWE-rebench-V2

upvoted an article about 2 months ago

Mixture of Experts (MoEs) in Transformers

upvoted an article about 2 months ago

Mixture of Experts Explained

upvoted a collection about 2 months ago

SWE-rebench-V2

Collection

SWE-rebench-V2 is a curated dataset of software-engineering tasks derived from real GitHub issues and pull requests. • 3 items • Updated Mar 3 • 11

upvoted 2 articles about 2 months ago

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.12k

upvoted an article 4 months ago

Article

We Got Claude to Fine-Tune an Open Source LLM

burtenshaw, evalstate • Dec 4, 2025 • 624

liked a Space 6 months ago

The Smol Training Playbook

📚

3.17k

The secrets to building world-class LLMs

upvoted an article 6 months ago

Article

Granite 4.0 Nano: Just how small can you go?

ibm-granite • Oct 28, 2025 • 124

upvoted a collection 7 months ago

🦫 PIPer

Collection

All the resources for our paper "PIPer: On-Device Environment Setup via Online Reinforcement Learning"! • 9 items • Updated Oct 1, 2025 • 3

upvoted a paper 7 months ago

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Paper • 2509.25455 • Published Sep 29, 2025 • 38

liked a dataset 8 months ago

nebius/SWE-rebench

Viewer • Updated Dec 23, 2025 • 27.9k • 83.1k • 62

liked a model 11 months ago

sggetao/icae

Updated Mar 30, 2024 • 5

upvoted an article about 1 year ago

Article

CircleGuardBench: New Standard for Evaluating AI Moderation Models

whitecircle • May 7, 2025 • 60

upvoted a paper about 1 year ago

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers

Paper • 2504.20752 • Published Apr 29, 2025 • 95

liked a model about 1 year ago

Qwen/Qwen2.5-7B-Instruct

Text Generation • 8B • Updated Jan 12, 2025 • 13.3M • 1.27k

liked 2 datasets about 1 year ago

nebius/SWE-agent-trajectories

Viewer • Updated Dec 23, 2024 • 80k • 2.22k • 80

Solaris99/AgentBank

Viewer • Updated Oct 10, 2024 • 53.2k • 547 • 21

liked 2 models about 1 year ago

MadeAgents/Hammer2.1-7b

Updated Jun 12, 2025 • 307 • 32

watt-ai/watt-tool-8B

Updated Dec 20, 2024 • 56.1k • 117

liked a dataset about 1 year ago

Salesforce/xlam-function-calling-60k

Viewer • Updated Jan 24, 2025 • 60k • 15.1k • 612

liked a model about 1 year ago

Salesforce/xLAM-8x7b-r

Text Generation • 47B • Updated Apr 11, 2025 • 902 • 15

upvoted an article over 1 year ago

Article

Introduction to State Space Models (SSM)

lbourdois • Jul 19, 2024 • 223
