kyutai/stt-1b-en_fr

Tags: Automatic Speech Recognition, Moshi, Safetensors, English, French, stt, audio

How to use kyutai/stt-1b-en_fr with the Moshi library:

    # pip install moshi
    # Run the interactive web server
    python -m moshi.server --hf-repo "kyutai/stt-1b-en_fr"
    # Then open http://localhost:8998 in your browser

    Or use the Mimi audio codec from Python:

    # pip install moshi
    import torch
    from moshi.models import loaders
    
    # Load checkpoint info from HuggingFace
    checkpoint = loaders.CheckpointInfo.from_hf_repo("kyutai/stt-1b-en_fr")
    
    # Load the Mimi audio codec
    mimi = checkpoint.get_mimi(device="cuda")
    mimi.set_num_codebooks(8)
    
    # Encode audio (24kHz, mono)
    wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples]
    with torch.no_grad():
        codes = mimi.encode(wav.cuda())
        decoded = mimi.decode(codes)
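
As a rough sanity check for the snippet above, the following sketch computes the tensor shapes one might expect before and after encoding. It rests on several assumptions that this page does not state: that Mimi runs at a frame rate of 12.5 Hz (1920 samples per frame at 24 kHz), that `mimi.encode` returns codes laid out as [batch, codebooks, frames], and that a partial final frame is rounded up. The helper `expected_shapes` is hypothetical and not part of the `moshi` package.

```python
import math

SAMPLE_RATE = 24_000   # from the snippet above: 24 kHz mono input
FRAME_RATE = 12.5      # assumption: Mimi frame rate in Hz (not stated on this page)
NUM_CODEBOOKS = 8      # matches mimi.set_num_codebooks(8) in the snippet above


def expected_shapes(duration_s, batch=1):
    """Return (wav_shape, codes_shape) for a clip of the given duration.

    Hypothetical helper for illustration only; it does not call moshi.
    """
    num_samples = int(round(duration_s * SAMPLE_RATE))
    samples_per_frame = int(SAMPLE_RATE / FRAME_RATE)  # 1920 under the assumption above
    # Assumption: a partial final frame is padded, so round up.
    num_frames = math.ceil(num_samples / samples_per_frame)
    wav_shape = (batch, 1, num_samples)                 # [batch, channels, samples]
    codes_shape = (batch, NUM_CODEBOOKS, num_frames)    # assumed [batch, codebooks, frames]
    return wav_shape, codes_shape


wav_shape, codes_shape = expected_shapes(10)
print(wav_shape)    # (1, 1, 240000)
print(codes_shape)  # (1, 8, 125)
```

If the assumptions hold, comparing these values against `wav.shape` and `codes.shape` is a quick way to catch audio that was loaded at the wrong sample rate or with more than one channel.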
Community discussions:

  • #11 GGUF + pure-C++ runtime in CrispASR — Kyutai STT (Mimi codec + LM) (opened 23 days ago by cstr)
  • #10 AutoProcessor.from_pretrained(model_id, language="pa", task="transcribe") - Error - Transformers does not recognize this architecture - model type `stt` (3 comments, opened 8 months ago by jssaluja)
  • #9 What is the tokenization and alignment approach? i.e. collation (11 comments, opened 8 months ago by RonanMcGovern)
  • #8 Improve model card: Add pipeline tag, paper link, and sample usage (opened 9 months ago by nielsr)
  • #4 Update README.md (opened 11 months ago by huhe-2024)
  • #2 Local Installation Video and Testing - Step by Step (opened 11 months ago by fahdmirzac)
  • #1 Thank you! (opened 11 months ago by ndgold)