James Chang

strategist922

AI & ML interests

Multimodal Learning

Recent Activity

reacted to Kseniase's post with ❤️ 6 days ago

12 Types of JEPA

Since Yann LeCun together with Randall Balestriero released a new paper on JEPA (Joint-Embedding Predictive Architecture), laying out its theory and introducing an efficient practical version called LeJEPA, we figured you might need even more JEPA. Here are 7 recent JEPA variants plus 5 iconic ones:

1. LeJEPA → https://huggingface.co/papers/2511.08544 Lays out a full theory for JEPAs, defining the "ideal" JEPA embedding as an isotropic Gaussian, and proposes the SIGReg objective to push JEPA toward this ideal, resulting in the practical LeJEPA

2. JEPA-T → https://huggingface.co/papers/2510.00974 A text-to-image model that tokenizes images and captions with a joint predictive Transformer, enhances fusion with cross-attention and text embeddings before the training loss, and generates images by iteratively denoising visual tokens conditioned on text

3. Text-JEPA → https://huggingface.co/papers/2507.20491 Converts natural language into first-order logic, with a Z3 solver handling reasoning, enabling efficient, explainable QA with far lower compute than large LLMs

4. N-JEPA (Noise-based JEPA) → https://huggingface.co/papers/2507.15216 Connects self-supervised learning with diffusion-style noise by using noise-based masking and multi-level schedules, especially improving visual classification

5. SparseJEPA → https://huggingface.co/papers/2504.16140 Adds sparse representation learning to make embeddings more interpretable and efficient. It groups latent variables by shared semantic structure using a sparsity penalty while preserving accuracy

6. TS-JEPA (Time Series JEPA) → https://huggingface.co/papers/2509.25449 Adapts JEPA to time series by learning latent self-supervised representations and predicting future latents for robustness to noise and confounders

Read further below ↓

If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

liked a model 21 days ago

moonshotai/Kimi-K2-Thinking

upvoted a paper 23 days ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Organizations

None yet

upvoted a paper 23 days ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 30 days ago • 113

upvoted a paper 2 months ago

WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

Paper • 2509.13312 • Published Sep 16 • 104

upvoted 7 papers 3 months ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 73

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 188

IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

Paper • 2506.21619 • Published Jun 23 • 3

K2-Think: A Parameter-Efficient Reasoning System

Paper • 2509.07604 • Published Sep 9 • 11

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 207

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 48

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25 • 54

upvoted an article 4 months ago

Article

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

Aug 11

upvoted 4 papers 4 months ago

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Paper • 2506.18898 • Published Jun 23 • 33

upvoted a paper 5 months ago

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Paper • 2506.05551 • Published Jun 5 • 5

upvoted an article 5 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8 • 732

upvoted a paper 5 months ago

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Paper • 2501.18795 • Published Jan 30 • 12

upvoted a collection 6 months ago

dots.llm1

Collection

2 items • Updated Jun 11 • 18

upvoted 2 papers 7 months ago

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Paper • 2406.04151 • Published Jun 6, 2024 • 24

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

Paper • 2403.12881 • Published Mar 19, 2024 • 18