Article: What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware (published Aug 8)
Article: Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance (published May 21)
Paper: Better & Faster Large Language Models via Multi-token Prediction (arXiv:2404.19737, published Apr 30, 2024)
Paper: Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache (arXiv:2506.11886, published Jun 13)
Paper: Scaling Laws for Native Multimodal Models (arXiv:2504.07951, published Apr 10)
Paper: Training Large Language Models to Reason in a Continuous Latent Space (arXiv:2412.06769, published Dec 9, 2024)
Paper: Large Language Monkeys: Scaling Inference Compute with Repeated Sampling (arXiv:2407.21787, published Jul 31, 2024)
Paper: LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (arXiv:2410.02884, published Oct 3, 2024)
Paper: MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding (arXiv:2408.11049, published Aug 20, 2024)
Paper: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing (arXiv:2406.08464, published Jun 12, 2024)
Collection: Sparse Foundational Llama 2 Models — sparse pre-trained and fine-tuned Llama models by Neural Magic + Cerebras (27 items, updated Apr 16)
Collection: A little guide to building Large Language Models in 2024 — resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 (19 items, updated Apr 1, 2024)
Paper: RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation (arXiv:2403.05313, published Mar 8, 2024)
Paper: Simple and Scalable Strategies to Continually Pre-train Large Language Models (arXiv:2403.08763, published Mar 13, 2024)