arXiv:2511.16301

Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling

Published on Nov 20 · Submitted by minseokseo on Nov 25

AI-generated summary

Upsample Anything is a lightweight test-time optimization framework that upsamples low-resolution features to high-resolution, pixel-wise outputs without any training, using a learned anisotropic Gaussian kernel for edge-aware reconstruction in tasks such as semantic segmentation and depth estimation.

Abstract

We present Upsample Anything, a lightweight test-time optimization (TTO) framework that restores low-resolution features to high-resolution, pixel-wise outputs without any training. Although Vision Foundation Models demonstrate strong generalization across diverse downstream tasks, their representations are typically downsampled by 14×/16× (e.g., ViT), which limits their direct use in pixel-level applications. Existing feature upsampling approaches depend on dataset-specific retraining or heavy implicit optimization, restricting scalability and generalization. Upsample Anything addresses these issues through a simple per-image optimization that learns an anisotropic Gaussian kernel combining spatial and range cues, effectively bridging Gaussian Splatting and Joint Bilateral Upsampling. The learned kernel acts as a universal, edge-aware operator that transfers seamlessly across architectures and modalities, enabling precise high-resolution reconstruction of features, depth, or probability maps. It runs in only ≈0.419 s per 224×224 image and achieves state-of-the-art performance on semantic segmentation, depth estimation, and both depth and probability map upsampling. Project page: https://seominseok0429.github.io/Upsample-Anything/
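
The abstract's core operation, weighting neighbors by a spatial Gaussian times a range (color) term from a high-resolution guide, is the Joint Bilateral Upsampling idea the paper builds on. Below is a minimal PyTorch sketch of such an edge-aware spatial-range kernel. The function name, window size, and the single-bandwidth isotropic simplification are illustrative assumptions; the paper itself learns per-pixel anisotropic kernel parameters at test time.

```python
import torch
import torch.nn.functional as F

def joint_bilateral_upsample(feat_lr, guide_hr, sigma_s=2.0, sigma_r=0.1, k=7):
    """Upsample feat_lr (1, C, h, w) to the resolution of guide_hr (1, 3, H, W).

    Each high-resolution pixel aggregates bilinearly upsampled features over a
    k x k window, weighted by a spatial Gaussian times a range (color) Gaussian
    computed on the high-resolution guide image.
    """
    H, W = guide_hr.shape[-2:]
    feat_up = F.interpolate(feat_lr, size=(H, W), mode="bilinear",
                            align_corners=False)

    pad = k // 2
    # Gather k x k neighborhoods of the upsampled features and of the guide.
    feat_n = F.unfold(feat_up, k, padding=pad).view(1, feat_up.shape[1],
                                                    k * k, H, W)
    guide_n = F.unfold(guide_hr, k, padding=pad).view(1, 3, k * k, H, W)

    # Spatial term: isotropic Gaussian over pixel offsets inside the window
    # (the paper learns anisotropic, per-pixel covariances instead).
    ys, xs = torch.meshgrid(torch.arange(k) - pad, torch.arange(k) - pad,
                            indexing="ij")
    w_spatial = torch.exp(-(xs ** 2 + ys ** 2).float()
                          / (2 * sigma_s ** 2)).view(1, 1, k * k, 1, 1)

    # Range term: Gaussian over color differences to the window's center pixel.
    d2 = ((guide_n - guide_hr.unsqueeze(2)) ** 2).sum(dim=1, keepdim=True)
    w_range = torch.exp(-d2 / (2 * sigma_r ** 2))

    # Normalized edge-aware average: weights fall off across color edges,
    # so features do not bleed over object boundaries.
    w = w_spatial * w_range
    return (feat_n * w).sum(dim=2) / w.sum(dim=2).clamp_min(1e-8)

# Example: lift 16x-downsampled ViT-style features to image resolution.
feat_lr = torch.randn(1, 16, 14, 14)
guide_hr = torch.rand(1, 3, 224, 224)
feat_hr = joint_bilateral_upsample(feat_lr, guide_hr)  # (1, 16, 224, 224)
```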

Community

Paper submitter
  • This paper presents a remarkably simple yet highly effective test-time optimization framework for feature upsampling (a hypothetical sketch of such a loop follows this list).
  • The method is fully training-free and generalizes seamlessly across domains, tasks, and backbone architectures.
  • Its per-pixel anisotropic Gaussian formulation offers strong edge preservation and superior spatial fidelity compared to prior work.
  • The approach is computationally lightweight, scalable to high resolutions, and consistently achieves state-of-the-art performance.
  • Overall, this work provides a robust and universal upsampler that elegantly bridges JBU and Gaussian Splatting.
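
To make the "per-image optimization" concrete, here is a hypothetical test-time optimization loop that fits the two bandwidths of the `joint_bilateral_upsample` sketch above under an assumed cycle-consistency objective: the upsampled features, pooled back to the input grid, should reproduce the low-resolution features. The loss, log-parameterization, and step count are assumptions for illustration; the paper optimizes richer per-pixel anisotropic parameters, and its exact objective may differ.

```python
import torch
import torch.nn.functional as F

def tto_upsample(feat_lr, guide_hr, steps=100, lr=0.05):
    """Fit the two kernel bandwidths per image, then upsample with them."""
    h, w = feat_lr.shape[-2:]
    # Log-parameterization keeps both bandwidths positive during optimization.
    log_sigma_s = torch.tensor(0.7, requires_grad=True)   # exp(0.7)  ~ 2.0 px
    log_sigma_r = torch.tensor(-2.3, requires_grad=True)  # exp(-2.3) ~ 0.1
    opt = torch.optim.Adam([log_sigma_s, log_sigma_r], lr=lr)

    for _ in range(steps):
        feat_hr = joint_bilateral_upsample(
            feat_lr, guide_hr, log_sigma_s.exp(), log_sigma_r.exp())
        # Assumed cycle-consistency objective: the high-res output, averaged
        # back down to the input grid, should match the low-res features.
        loss = F.mse_loss(F.adaptive_avg_pool2d(feat_hr, (h, w)), feat_lr)
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        return joint_bilateral_upsample(
            feat_lr, guide_hr, log_sigma_s.exp(), log_sigma_r.exp())
```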
