Taehyeon Kim

Kthyeon

https://taehyeon.oopy.io/

AI & ML interests

LLM Inference: Parallel, Speculative, Instructive Decoding

Recent Activity

liked a model about 14 hours ago

LGAI-EXAONE/K-EXAONE-236B-A23B

upvoted a paper 1 day ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

upvoted a paper 3 months ago

Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models

Organizations

liked a model about 14 hours ago

LGAI-EXAONE/K-EXAONE-236B-A23B

Text Generation • 237B • Updated about 15 hours ago • 11 • 155

upvoted a paper 1 day ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published 8 days ago • 16

upvoted 2 papers 3 months ago

Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models

Paper • 2510.11057 • Published Oct 13, 2025 • 30

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 116

upvoted 2 papers 6 months ago

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

Paper • 2507.11407 • Published Jul 15, 2025 • 58

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper • 2507.10524 • Published Jul 14, 2025 • 70

upvoted 2 papers 7 months ago

Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Paper • 2505.22960 • Published May 29, 2025 • 16

Flex-Judge: Think Once, Judge Anywhere

Paper • 2505.18601 • Published May 24, 2025 • 27

upvoted a paper 8 months ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21, 2025 • 44

liked a model 10 months ago

LGAI-EXAONE/EXAONE-Deep-32B

Text Generation • 32B • Updated Mar 19, 2025 • 1.38k • 298

New activity in microsoft/Phi-3-small-8k-instruct about 1 year ago

Getting the error: "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 180224, Hardware limit: 166912. Reducing block sizes or `num_stages` may help."

👍 4

#27 opened over 1 year ago by Pranav0511

upvoted a paper over 1 year ago

Phantom of Latent for Large Language and Vision Models

Paper • 2409.14713 • Published Sep 23, 2024 • 28

liked a model over 1 year ago

facebook/multi-token-prediction

Updated Jun 18, 2024 • 371

upvoted a paper over 1 year ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 104

authored a paper over 1 year ago

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

Paper • 2406.16758 • Published Jun 24, 2024 • 20

upvoted 2 papers over 1 year ago

Towards Fast Inference: Exploring and Improving Blockwise Parallel Drafts

Paper • 2404.09221 • Published Apr 14, 2024 • 1

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

Paper • 2406.16758 • Published Jun 24, 2024 • 20

commented a paper over 1 year ago

Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

Paper • 2406.16758 • Published Jun 24, 2024 • 20

upvoted a paper over 1 year ago

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4, 2024 • 41

authored a paper over 1 year ago

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4, 2024 • 41
