Since Yann LeCun together with Randall Balestriero released a new paper on JEPA (Joint-Embedding Predictive Architecture), laying out its theory and introducing an efficient practical version called LeJEPA, we figured you might need even more JEPA. Here are 7 recent JEPA variants plus 5 iconic ones:
6. TS-JEPA (Time Series JEPA): Joint Embeddings Go Temporal (2509.25449). Adapts JEPA to time series by learning latent self-supervised representations and predicting future latents, for robustness to noise and confounders.
CAG (cache-augmented generation) preloads document content into an LLM's context as a precomputed key-value (KV) cache. This caching eliminates the need for real-time retrieval during inference, reducing token usage by up to 76% while maintaining answer quality.
CAG is particularly effective for constrained knowledge bases like internal documentation, FAQs, and customer support systems, where all relevant information can fit within the model's extended context window.
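To make the caching idea concrete, here is a minimal toy sketch in pure Python, not a real LLM. It assumes a single causal attention layer with no positional encoding, and the names `embed`, `project`, `build_kv_cache`, `answer_with_cache`, and `full_forward` are hypothetical helpers invented for illustration. The point it shows: keys and values for the document are computed once, and each new query only computes keys and values for its own tokens while attending over the cached ones.

```python
import math

DIM = 4  # toy hidden size (assumption; real models use hundreds or thousands)

def embed(token_id):
    # Hypothetical deterministic embedding; a real model uses learned weights.
    return [math.sin(token_id * (i + 1)) for i in range(DIM)]

def project(vec, scale):
    # Stand-in for a learned linear projection such as W_k or W_v.
    return [scale * x for x in vec]

def attend(query, keys, values):
    # Scaled dot-product attention of one query over all visible keys/values.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(DIM)]

def build_kv_cache(doc_ids):
    # Done once, offline: precompute keys and values for every document token.
    keys = [project(embed(t), 0.5) for t in doc_ids]
    values = [project(embed(t), 2.0) for t in doc_ids]
    return keys, values

def answer_with_cache(cache, query_ids):
    # Copy the cached lists so the same cache can serve many queries.
    keys, values = list(cache[0]), list(cache[1])
    outputs = []
    for t in query_ids:
        x = embed(t)
        keys.append(project(x, 0.5))    # only query tokens are processed here
        values.append(project(x, 2.0))
        outputs.append(attend(x, keys, values))
    return outputs

def full_forward(token_ids):
    # Baseline without a cache: reprocess document and query tokens together.
    keys, values, outputs = [], [], []
    for t in token_ids:
        x = embed(t)
        keys.append(project(x, 0.5))
        values.append(project(x, 2.0))
        outputs.append(attend(x, keys, values))
    return outputs

doc = [3, 1, 4, 1, 5, 9]        # toy "document" token ids
cache = build_kv_cache(doc)     # precomputed once
for query in ([2, 7], [1, 8, 2]):
    cached = answer_with_cache(cache, query)
    baseline = full_forward(doc + query)[len(doc):]
    same = all(math.isclose(a, b) for ca, ba in zip(cached, baseline) for a, b in zip(ca, ba))
    print(len(query), "query tokens processed, matches full recompute:", same)
```

In this toy setup the cached path gives the same outputs as reprocessing the whole document for every query, while touching only the query tokens at inference time. A real deployment would keep one such cache per transformer layer and attention head, which is also why the whole knowledge base has to fit within the context window.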