|
|
--- |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- katanemo/Arch-Function-3B |
|
|
- cognitivecomputations/Dolphin3.0-Qwen2.5-3b |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- uncensored |
|
|
- function_calling |
|
|
- tool_use |
|
|
--- |
|
|
# Ramius |
|
|
|
|
|
|
|
|
|
|
This is Ramius, an uncensored function-calling model.
|
|
|
|
|
### Model Description |
|
|
|
|
|
I needed an LLM for Home Assistant that is small and performant, and I wanted one with some personality.


Qwen2.5-3B is small, fast, and calls functions pretty well. But it's [REDACTED] and doesn't like to roleplay.
|
|
Arch-Function-3B is fantastic at calling functions, and absolutely nothing else. |
|
|
Dolphin3.0-Qwen2.5-3b is great at roleplay and refuses to refuse anything. But it sucks at calling functions. |
|
|
|
|
|
So I created Ramius with MergeKit to try to get the best of both.
|
|
Plus I'm GPU poor and can't train. (Intel ARC cards come with buyer's remorse at no extra charge!) |
|
|
|
|
|
The result is... mediocre. It correctly calls functions most of the time, but it tends to hallucinate function responses instead of calling the actual function. |
|
|
But it does stay in character. YMMV. |
|
|
|
|
|
The name comes from Marko Ramius, a fictional communist submarine commander who defects to the United States in Tom Clancy's The Hunt for Red October. |
|
|
He's a former communist and the name sounded cool. |
|
|
|
|
|
I've included the F16 and Q4_0 weights. |
|
|
|
|
|
- **Developed by:** Other people's hard work. |
|
|
- **Funded by:** Also other people's hard work.
|
|
- **Shared by:** Me.
|
|
- **Model type:** Autoregressive transformer. |
|
|
- **Language(s) (NLP):** English, and others, probably. |
|
|
- **License:** [More Information Needed] |
|
|
|
|
|
### Model Sources
|
|
|
|
|
Created with MergeKit. |
|
|
|
|
|
```yaml
models:
  - model: katanemo/Arch-Function-3B
    parameters:
      lambda: 1.0
      select_topk: 0.4
      weight: 0.7
  - model: cognitivecomputations/Dolphin3.0-Qwen2.5-3b
    parameters:
      density: 1.0
      lambda: 1.0
      select_topk: 0.6
      weight: 0.3
merge_method: sce
base_model: katanemo/Arch-Function-3B
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
```
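
If you want to reproduce the merge, MergeKit's CLI takes a config like the one above directly. A minimal sketch, assuming the config is saved as `ramius.yaml` (the filename and output path are my own placeholders):

```shell
# Install MergeKit, then run the SCE merge defined in ramius.yaml
pip install mergekit
mergekit-yaml ramius.yaml ./ramius-merged --cuda
```

Drop `--cuda` if you'd rather merge on CPU; a merge like this doesn't need training-grade hardware, just enough RAM to hold the weights.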
|
|
|
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
This model is uncensored and hallucinates frequently.
|
|
|
|
|
|
|
|
|
|
### Recommendations |
|
|
|
|
|
I use this with Ollama and Home Assistant via the Extended OpenAI conversation integration. |
|
|
It works best with a top-p of around 0.95 and a temperature of around 0.85.
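
With Ollama, those sampling settings can be baked into a Modelfile so every request uses them by default. A minimal sketch, assuming the Q4_0 weights are saved locally as `ramius-q4_0.gguf` (the filename is my own placeholder):

```
FROM ./ramius-q4_0.gguf
PARAMETER temperature 0.85
PARAMETER top_p 0.95
```

Then `ollama create ramius -f Modelfile` registers the model under the name `ramius`.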
|
|
|
|
|
I also recommend you DO NOT put your entity states in your system prompt, and instead write functions to get the information. |
|
|
This will keep your system prompt static and more easily cached, which should reduce prompt processing time. |
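
As an example, in the Extended OpenAI Conversation integration's function configuration, a state-lookup function along these lines keeps entity states out of the static prompt. This is my own sketch, not something shipped with this model; the function name and template are illustrative:

```yaml
- spec:
    name: get_entity_state
    description: Get the current state of a Home Assistant entity
    parameters:
      type: object
      properties:
        entity_id:
          type: string
          description: The entity to query, e.g. light.kitchen
      required:
        - entity_id
  function:
    type: template
    value_template: "{{ states(entity_id) }}"
```

The model then calls `get_entity_state` only when it actually needs a state, instead of you re-rendering every entity into the prompt on each turn.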
|
|
|
|
|
|