# 🧠 NanoAgent: 135M Parameter Agentic LLM

NanoAgent is a compact 135M-parameter, 8k-context-length language model trained to perform tool calls and generate responses based on tool outputs. Despite its small size (~135 MB in 8-bit precision), it is optimized for agentic use cases and runs easily on personal devices.
## ✨ Features

- 🧰 Tool Calling – emits structured tool calls and incorporates tool outputs into its responses.
- 🧠 Instruction Following – strong instruction-following abilities.
- 🧠 Basic Reasoning – handles lightweight reasoning and ReAct-style interactions.
- ⚡ Lightweight – runs on local hardware with minimal resources.
 
## 🧪 Training Overview

- Base model: SmolLM2-135M-Instruct
- Fine-tuning method: Dynamic Fine-Tuning (DFT)
- Hardware: Apple Mac M1 (16 GB unified memory) using MLX
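Because training was done with MLX, the model can also be run locally with the `mlx-lm` package. A minimal sketch, assuming the Hub repo loads through `mlx_lm`'s automatic conversion (the prompt and sampling settings here are illustrative):

```python
# Minimal MLX inference sketch; assumes the repo is loadable via mlx_lm's
# on-the-fly conversion (pip install mlx-lm, Apple Silicon only).
from mlx_lm import load, generate

model, tokenizer = load("quwsarohi/NanoAgent-135M")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hi! Do you have a name?"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```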
## 📚 Datasets Used

- `microsoft/orca-agentinstruct-1M-v1` – agentic tasks, RAG answers, classification
- `microsoft/orca-math-word-problems-200k` – lightweight reasoning
- `allenai/tulu-3-sft-personas-instruction-following` – instruction following
- `xingyaoww/code-act` – ReAct-style reasoning and action
- `m-a-p/Code-Feedback` – alignment via feedback
- `HuggingFaceTB/smoltalk` (+/apigen) – tool-calling stabilization
- `weijie210/gsm8k_decomposed` – question decomposition
- `Locutusque/function-calling-chatml` – tool-call response structure
## ⚠️ Disclaimer

This is a beta model.

- It may produce incorrect or incomplete outputs.
- Tool call execution is basic and can fail in some cases.
- Intended for research and experimentation only, not production use.
 
## 🚧 Roadmap

- ✅ Initial release with DFT fine-tuning
- 🧪 Benchmarking on agentic tasks
- 🔬 Experimenting with GRPO for tool calling (failed)
- 🚧 Weight merging experiments for improved performance
- Add more tool-calling datasets
 
## 📥 Model Size

- 135M parameters
- ~135 MB in 8-bit precision
- 8k context length
 
## 🧪 Benchmarks

Benchmarks were conducted with `temperature=0` and sampling disabled for fair evaluation, using `llm_eval`.
| Metric / Task | SmolLM2-135M-Instruct | NanoAgent | 
|---|---|---|
| Parameters | 135M | 135M | 
| Context Length | 8k | 8k | 
| IFEval Score (Overall) | 5.69 | 9.46 | 
| MMLU | 22.96 | 23.07 | 
| Commonsense QA | 19.66 | 19.57 | 
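The card does not specify which harness `llm_eval` refers to; if it means EleutherAI's lm-evaluation-harness (an assumption), a comparable run could be reproduced along these lines:

```python
# Sketch of an evaluation run, assuming "llm_eval" refers to EleutherAI's
# lm-evaluation-harness (pip install lm-eval); task names are the harness's.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=quwsarohi/NanoAgent-135M",
    tasks=["ifeval", "mmlu", "commonsense_qa"],
)
print(results["results"])
```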
## ⚡ Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "quwsarohi/NanoAgent-135M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


def inference(messages, max_new_tokens=256, temperature=0.3, min_p=0.15, **kwargs):
    # Format the chat history with the model's chat template.
    input_text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        min_p=min_p,
        temperature=temperature,
        **kwargs,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)


messages = [{"role": "user", "content": "Hi! Do you have a name?"}]
print(inference(messages))
```
Use the following system prompt template for tool calling:

```python
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags. 
Based on question, you may need to make one or more function/tool calls to answer user.
You have access to the following tools/functions:
<tools>{tools}</tools>
For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""
```
Sample tool call definition:
{
  "name": "web_search",
  "description": "Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to perform.",
      }
    },
    "required": ["query"],
  },
}
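Putting the pieces together, the sketch below reuses the `inference` helper and `TOOL_TEMPLATE` from above: it formats the `web_search` definition into the system prompt, parses the model's `<tool_call>` output, and feeds a tool result back for a final answer. The `run_web_search` stub, the `"arguments"` key, and the use of a `tool` role for results are assumptions, not details confirmed by the model card:

```python
import json
import re

# The sample web_search definition from above, as a Python dict.
WEB_SEARCH_TOOL = {
    "name": "web_search",
    "description": "Performs a web search for a query and returns the top results.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query to perform."}
        },
        "required": ["query"],
    },
}


def run_web_search(query: str) -> str:
    # Hypothetical stand-in for a real search backend.
    return "Burj Khalifa (828 m) is the tallest building in the world."


# Build the system prompt from TOOL_TEMPLATE and the tool definition.
messages = [
    {"role": "system", "content": TOOL_TEMPLATE.format(tools=json.dumps([WEB_SEARCH_TOOL]))},
    {"role": "user", "content": "What is the tallest building in the world?"},
]
reply = inference(messages)

# The model answers with a JSON list inside <tool_call></tool_call> tags.
match = re.search(r"<tool_call>(.*?)</tool_call>", reply, re.DOTALL)
if match:
    messages.append({"role": "assistant", "content": reply})
    for call in json.loads(match.group(1)):
        result = run_web_search(**call["arguments"])
        # Returning results under a "tool" role is an assumption about the
        # chat template; adjust to match the model's training format.
        messages.append({"role": "tool", "content": result})
    print(inference(messages))  # final answer grounded in the tool output
```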