---
license: apple-amlr
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
tags:
- rag
- compression
- retrieval
- generation
---
# CLaRa-7B-Base (Compression-16 & 128)
CLaRa-7B-Base is our foundational unified RAG model with built-in semantic document compression at two ratios, 16× and 128×.
It provides a base compressor and generator that produce answers directly from compressed document representations.
**Training recipe:** QA-guided semantic compression and paraphrase-consistency objectives.
**Benchmarks:** Strong baseline performance across multi-hop QA tasks under a 16× compression ratio.
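
To make the two compression ratios concrete, here is a minimal sketch of the arithmetic they imply. It assumes that "16×" means roughly one compressed representation per 16 document tokens and that a partial final group rounds up; the helper name `num_compressed_vectors` is hypothetical and is not part of the CLaRa codebase, and the model's actual segmentation may differ.

```python
import math


def num_compressed_vectors(num_doc_tokens: int, compression_ratio: int) -> int:
    """Hypothetical helper: estimate the compressed length of a document.

    Assumes one compressed vector per `compression_ratio` input tokens,
    rounding up for a partial final group. This is an illustration of the
    ratio, not the model's actual implementation.
    """
    if num_doc_tokens < 0:
        raise ValueError("num_doc_tokens must be non-negative")
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be at least 1")
    return math.ceil(num_doc_tokens / compression_ratio)


# Compare the two ratios shipped with this model on a 256-token document.
for ratio in (16, 128):
    print(f"{ratio}x: {num_compressed_vectors(256, ratio)} compressed vectors")
```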
---
## More details and usage examples
Paper: [CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning](https://arxiv.org/abs/2511.18659)
GitHub: https://github.com/apple/ml-clara
---
## Example Usage
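```python
from transformers import AutoModel

# Load the 16x-compression checkpoint. Replace this path with the location
# of your own copy of the CLaRa-7B-Base weights.
unirag = AutoModel.from_pretrained(
    "/mnt/ceph_rbd/model/CLaRa-7B-Base/compression-16",
    trust_remote_code=True,  # loads the custom model code shipped with the checkpoint
).to("cuda")

# One inner list of documents per entry in `questions`.
documents = [
    [
        "Weldenia is a monotypic genus of flowering plant in the family Commelinaceae...",
        "Hagsatera is a genus of orchids native to Mexico and Guatemala...",
        "Alsobia is a genus of flowering plants native to Mexico and Central America...",
    ]
]

# The question is left as an empty string in this example.
questions = [""]

out = unirag.generate_from_paraphrase(
    questions=questions,
    documents=documents,
    max_new_tokens=64,
)

print("Generated answer:", out)
```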