Spaces: akhaliq/MobileLLM-Pro

Running on Zero

feat(optim): load the model and tokenizer outside of the spaces wrapped method

by raphael-gl HF Staff - opened 9 days ago

base: refs/heads/main ← from: refs/pr/3


raphael-gl, 9 days ago (edited 9 days ago)

On the one hand, we lose lazy initialization. On the other, we benefit from tensor packing on ZeroGPU, so the model has a smaller memory footprint when idle. Besides, callers no longer spend their GPU quota on loading the model: it is already downloaded, loaded in memory, and prepared for serving.
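The trade-off can be sketched with a toy, standard-library-only example. Everything here is hypothetical and for illustration only: `gpu` stands in for the ZeroGPU decorator (`spaces.GPU` on a real Space), `load_model` stands in for the actual model and tokenizer loading, and `gpu_log` just records whether a load happened inside a wrapped call. It does not model tensor packing; it only shows where the loading cost lands.

```python
import functools

# Records ("load", inside_gpu_call) each time the toy model is loaded.
gpu_log = []
_inside_gpu = False


def gpu(fn):
    """Hypothetical stand-in for the ZeroGPU decorator (assumption, not the real API)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global _inside_gpu
        _inside_gpu = True  # the caller's GPU quota is being consumed from here
        try:
            return fn(*args, **kwargs)
        finally:
            _inside_gpu = False
    return wrapper


def load_model():
    """Hypothetical loader; a real app would load the model and tokenizer here."""
    gpu_log.append(("load", _inside_gpu))
    return lambda prompt: prompt.upper()  # toy "model"


# Lazy variant: the first caller pays for loading inside the wrapped call.
_lazy_model = None


@gpu
def generate_lazy(prompt):
    global _lazy_model
    if _lazy_model is None:
        _lazy_model = load_model()
    return _lazy_model(prompt)


# Eager variant (the approach described above): load once at import time,
# outside the wrapped function, so no caller pays for it.
EAGER_MODEL = load_model()


@gpu
def generate_eager(prompt):
    return EAGER_MODEL(prompt)
```

In this sketch, calling `generate_eager` never triggers a load, while the first call to `generate_lazy` loads the toy model while the wrapped call is in progress, which is the cost this change moves out of the caller's path.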

feat(optim): load the model and tokenizer outside of the spaces wrapped method (3046b335)


Ready to merge

This branch is ready to get merged automatically.
