nielsr (HF Staff) committed
Commit 3a9aa7b · verified · Parent: 9fd8219

Add pipeline tag: feature-extraction


This PR enhances the model card by adding `pipeline_tag: feature-extraction` to the metadata. The tag accurately describes the model's core function, compressing text into continuous representations, and makes the model easier to discover for Hugging Face Hub users looking for text feature-extraction models.
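
For context, this kind of metadata change can also be applied programmatically. Below is a minimal sketch using `huggingface_hub.metadata_update`; the repo id is a placeholder, not this model's confirmed Hub id.

```python
# Sketch: applying the same model-card metadata change via huggingface_hub.
from huggingface_hub import metadata_update

metadata_update(
    repo_id="<org>/ARC8-Encoder_Llama",          # placeholder repo id
    metadata={"pipeline_tag": "feature-extraction"},
    create_pr=True,                               # open a PR instead of committing directly
)
```

Setting `create_pr=True` opens a pull request like this one rather than writing straight to the main branch.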

Files changed (1)
README.md (+4, -3)
README.md CHANGED
@@ -1,13 +1,14 @@
 ---
-license: cc-by-4.0
 language:
 - en
+license: cc-by-4.0
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+pipeline_tag: feature-extraction
 ---
 
 # ARC-Encoder models
 
 This page houses `ARC8-Encoder_Llama`, one of three pretrained ARC-Encoder variants. The architectures and training methods are described in the paper *ARC-Encoder: learning compressed text representations for large language models*, available [here](https://arxiv.org/abs/2510.20535). Code to reproduce the pretraining, further fine-tune the encoders, or evaluate them on downstream tasks is available in the [ARC-Encoder repository](https://github.com/kyutai-labs/ARC-Encoder/tree/main).
 
@@ -15,7 +16,7 @@ tags:
 
 All the encoders released here are trained on web crawl filtered with [Dactory](https://github.com/kyutai-labs/dactory) and are built on a [Llama3.2-3B](https://github.com/meta-llama/llama-cookbook) base backbone. The release consists of two ARC-Encoders, each trained specifically for a single decoder, and one trained for two decoders at the same time:
 - `ARC8-Encoder_Llama`, trained on 2.6B tokens specifically for the [Llama3.1-8B](https://github.com/meta-llama/llama-cookbook) base decoder, with a pooling factor of 8.
-- `ARC8-Encoder_Mistral`, trained on 2.6B tokens specifically for the [Mistral-7B](https://github.com/mistralai/mistral-finetune?tab=readme-ov-file) base decoder, with a pooling factor of 8.
+- `ARC8-Encoder_Mistral`, trained on 2.6B tokens specifically for the [Mistral-7B](https://www.mistralai.com/tech/mistral-7b/) base decoder, with a pooling factor of 8.
 - `ARC8-Encoder_multi`, trained by sampling among the two decoders, with a pooling factor of 8.
 
 ### Uses
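
The `pytorch_model_hub_mixin` tag in the front matter refers to the `PyTorchModelHubMixin` helper from `huggingface_hub`. The sketch below only illustrates the generic save/load pattern that tag implies; `TinyEncoder` is a stand-in, since the actual ARC-Encoder class and its configuration live in the ARC-Encoder repository.

```python
# Minimal, self-contained sketch of the PyTorchModelHubMixin workflow.
# `TinyEncoder` is a placeholder module, not the real ARC-Encoder class.
import torch
from huggingface_hub import PyTorchModelHubMixin


class TinyEncoder(torch.nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden_size: int = 16):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


# Round-trip through a local directory; with the real encoder class imported
# from the ARC-Encoder repository, the same `from_pretrained` call also accepts
# a Hub repo id.
model = TinyEncoder(hidden_size=16)
model.save_pretrained("tiny-encoder-demo")
reloaded = TinyEncoder.from_pretrained("tiny-encoder-demo")
print(reloaded(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```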