Instructions to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="TUM-EDA/Flui3d-Chat-Qwen3-Reasoning", filename="unsloth_qwen3_reasoning_bf16-00001-of-00014.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16 # Run inference directly in the terminal: llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16 # Run inference directly in the terminal: llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16 # Run inference directly in the terminal: ./llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16 # Run inference directly in the terminal: ./build/bin/llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Use Docker
docker model run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
- LM Studio
- Jan
- Ollama
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Ollama:
ollama run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
- Unsloth Studio new
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for TUM-EDA/Flui3d-Chat-Qwen3-Reasoning to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for TUM-EDA/Flui3d-Chat-Qwen3-Reasoning to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for TUM-EDA/Flui3d-Chat-Qwen3-Reasoning to start chatting
- Pi new
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Run Hermes
hermes
- Docker Model Runner
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Docker Model Runner:
docker model run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
- Lemonade
How to use TUM-EDA/Flui3d-Chat-Qwen3-Reasoning with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull TUM-EDA/Flui3d-Chat-Qwen3-Reasoning:BF16
Run and chat with the model
lemonade run user.Flui3d-Chat-Qwen3-Reasoning-BF16
List all available models
lemonade list
Flui3d Chat Model Qwen 3 Reasoning
Model Description
This model is a Fine-tuned version of Qwen 3 designed for microfluidic chip design generation. The model incorporates Chain-of-Thought (CoT) reasoning to translate high-level design requirements into structured microfluidic system descriptions.
The model generates outputs in a structured JSON format following a predefined schema (see: Output Format). The generated JSON describes a complete microfluidic chip, including:
- microfluidic components
- component parameters
- channel connections
- structural relationships between elements
This allows the model to act as a design file generator for microfluidic systems, enabling automated or AI-assisted microfluidic chip design workflows.
The repository includes:
- LoRA Adapter weights
- Quantized, split GGUF model files compatible with Ollama, may require merging before use
GGUF files can be merged using tools provided by llama.cpp (see: Merging Split GGUF Files).
Intended Use
This model is intended for:
- Automated microfluidic chip design generation
- AI-assisted CAD workflows for microfluidics
- Research in AI-assisted scientific design
- Programmatic generation of microfluidic device specifications
The model converts natural language design requirements into structured microfluidic design specifications.
Example Applications
- Rapid prototyping of microfluidic devices
- Automated generation of chip layouts
- Integration with microfluidic CAD pipelines
- AI-driven design exploration
Model Architecture
- Base Model: Qwen 3 32B
- Fine-tuning Method: Cold-start SFT LoRA
- Reasoning Strategy: Chain-of-Thought prompting and supervision
- Output Format: Structured JSON
The model is trained to produce schema-compliant structured outputs representing microfluidic chip configurations.
Output Format
The model generates JSON objects conforming to a predefined schema.
Schema definition:
https://github.com/TUM-EDA/Flui3d-Chat/blob/master/Dataset%20and%20Training%20Framework/datasets/resources/json_schemas/microfluidic_schema.json
The JSON output typically includes:
- Component definitions
- Channel connections
- Parameterized microfluidic elements
- Junction definitions
Example Output
{
"connections": [
{
"source": "inlet_1",
"target": "mixer_1"
},
{
"source": "inlet_2",
"target": "mixer_1"
},
{
"source": "mixer_1",
"target": "outlet_1"
}
],
"junctions": [
{
"id": "junction_1",
"type": "T-junction",
"source_1": "inlet_1",
"source_2": "inlet_2",
"target": "mixer_1"
}
],
"component_params": {
"mixers": [
{
"id": "mixer_1",
"num_turnings": 4
}
],
"delays": [],
"chambers": [],
"filters": []
}
Repository Contents
This repository includes:
1. LoRA Adapter
The LoRA adapter can be loaded on top of the base Qwen model for inference or further fine-tuning.
2. Quantized GGUF Models
Quantized GGUF format models compatible with:
- Ollama
- llama.cpp
Due to file size limitations, the GGUF models are split into multiple parts. These files must be merged before use.
Merging Split GGUF Files
To merge the split GGUF files, use the merging utilities from llama.cpp:
https://github.com/ggml-org/llama.cpp/blob/master/tools/gguf-split/README.md
Usage with Ollama
The merged GGUF file can be used with:
- Ollama
Example prompt:
Design a microfluidic chip with two inlets, one mixer, and a single outlet.
Limitations
- The model assumes valid schema-based output format and may produce invalid JSON if prompts are poorly structured.
- Generated designs should be validated before fabrication.
- The model does not replace domain expert verification.
Citation
If you use this model in academic work, please cite:
WILL BE PUBLISHED
- Downloads last month
- 6
16-bit
Model tree for TUM-EDA/Flui3d-Chat-Qwen3-Reasoning
Base model
Qwen/Qwen3-32B