CodeLlama-7B Fine-Tuned for Verilog FIFO Generation

Model Description

This is a LoRA fine-tuned CodeLlama-7B-Instruct model specialized in generating clean, synthesizable Verilog/SystemVerilog code for FIFO (First-In-First-Out) designs. The model was trained on a curated set of FIFO design specifications following industry best practices for RTL design.

Model Type: Causal Language Model (LoRA Adapter)
Base Model: codellama/CodeLlama-7B-Instruct-hf
Training Date: 2025-11-25
Organization: Elinnos Systems

Key Features

✅ Complete Code Generation - Generates full Verilog modules with proper structure
✅ Synthesizable RTL - Clean, synthesizable code following best practices
✅ FIFO Variants - Supports various FIFO architectures (synchronous, asynchronous, parameterized)
✅ Protocol Compliance - Proper signal handling, full/empty flags, error handling
✅ Production-Ready - No comments, no debug statements, clean functional code

Training Details

Dataset

  • Training Samples: 70
  • Validation Samples: 9
  • Test Samples: 15
  • Total Samples: 94
  • Dataset Format: CodeLlama chat template format (illustrative record below)
  • Maximum Sequence Length: 1536 tokens
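
Each training sample follows the CodeLlama chat-template structure with system, user, and assistant turns. A minimal illustrative record (the field contents here are hypothetical, not copied from the actual dataset):

# Hypothetical chat-format training record (illustrative only)
sample = {
    "messages": [
        {"role": "system", "content": "You are Elinnos RTL Code Generator v1.0 ..."},
        {"role": "user", "content": "Generate a synchronous FIFO with 16-bit data width, depth 8."},
        {"role": "assistant", "content": "module sync_fifo #(parameter WIDTH = 16, DEPTH = 8) (...);\n...\nendmodule"}
    ]
}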

Training Configuration

  • Base Model: codellama/CodeLlama-7B-Instruct-hf
  • Training Method: LoRA (Low-Rank Adaptation)
  • Quantization: 4-bit NF4 (for GPU memory efficiency)
  • Training Steps: 25 (≈5 steps per epoch × 5 epochs at an effective batch size of 16 over 70 samples)
  • Epochs: 5
  • Batch Size: 4
  • Effective Batch Size: 16 (with gradient accumulation)
  • Learning Rate: 2e-05
  • Max Sequence Length: 1536
  • Hardware: NVIDIA A100-SXM4-40GB

LoRA Configuration

from peft import LoraConfig

LoraConfig(
    r=48,                    # Rank
    lora_alpha=96,           # Scaling factor
    target_modules=['q_proj', 'up_proj', 'down_proj', 'gate_proj',
                    'o_proj', 'k_proj', 'v_proj'],
    lora_dropout=0.15,
    bias="none",
    task_type="CAUSAL_LM"
)
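
During training, this configuration is attached to the base model through PEFT's standard API; a minimal sketch (variable names are illustrative):

from peft import get_peft_model

# Wrap the already-loaded base model with the LoRA adapter
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # prints trainable vs. total parameter counts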

Target Modules:

  • q_proj
  • up_proj
  • down_proj
  • gate_proj
  • o_proj
  • k_proj
  • v_proj

Trainable Parameters: ~120M (LoRA adapter only; r=48 across 7 projection modules in all 32 layers)

Model Capabilities

What This Model Can Do

  1. Synchronous FIFO Design

    • Dual-port memory architecture
    • Write/Read pointer management
    • Full/Empty flag generation
    • Configurable width and depth
    • Error handling (write_err, read_err)
    • Threshold flags (almost_full, almost_empty)
    • Occupancy output
  2. Asynchronous FIFO Design

    • Gray code pointer synchronization
    • Clock domain crossing (CDC)
    • Separate read/write clocks
    • Data valid signals
  3. Advanced Features

    • Parameterized FIFO modules
    • AXI-like handshake protocols
    • Pipelined output stages
    • Peek capability
    • Clear signal support
  4. Code Quality

    • Clean, synthesizable RTL
    • No comments or debug statements
    • Proper signal naming conventions
    • Modular design structure
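
These behaviors are selected through the prompt. Illustrative requests (hypothetical wording, not taken from the training set) that exercise the synchronous and asynchronous feature sets above:

# Hypothetical prompts exercising the capabilities listed above
sync_prompt = (
    "Generate a parameterized synchronous FIFO with 32-bit data width, depth 16, "
    "almost_full/almost_empty threshold flags, an occupancy output, and "
    "write_err/read_err error handling."
)

async_prompt = (
    "Generate an asynchronous FIFO with separate read and write clocks, "
    "Gray code pointer synchronization for clock domain crossing, "
    "16-bit data width and depth 32."
)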

Usage

Installation

pip install transformers peft torch bitsandbytes accelerate

Loading the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from transformers import BitsAndBytesConfig

# Load base model with 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7B-Instruct-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
    trust_remote_code=True
)

# Load fine-tuned LoRA adapter
model = PeftModel.from_pretrained(
    base_model, 
    "Elinnos/codellama-7b-fifo-verilog"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7B-Instruct-hf")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
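
If GPU memory permits loading the base model in fp16 instead of 4-bit, the adapter can optionally be merged into the base weights for standalone, slightly faster inference; a sketch under that assumption (merging into a 4-bit quantized model is not shown here):

# Optional: load the base model in fp16 and fold the LoRA weights in
base_fp16 = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7B-Instruct-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
merged = PeftModel.from_pretrained(base_fp16, "Elinnos/codellama-7b-fifo-verilog")
merged = merged.merge_and_unload()  # returns a plain transformers model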

Generating Code

# Create prompt using CodeLlama chat template
system_prompt = "You are Elinnos RTL Code Generator v1.0, a specialized Verilog/SystemVerilog code generation agent. Your role: Generate clean, synthesizable RTL code for hardware design tasks. Output ONLY functional RTL code with no $display, assertions, comments, or debug statements."

user_prompt = "Generate a synchronous FIFO with 8-bit data width, depth 4, write_enable, read_enable, full flag, empty flag."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=800,
    temperature=0.3,
    do_sample=True,
    top_p=0.9,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id
)

# Decode response
response = tokenizer.decode(
    outputs[0][inputs['input_ids'].shape[1]:],
    skip_special_tokens=True  # drop chat-template/EOS markers from the output
)
print(response)
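
The decoded response should contain a single Verilog module. A small post-processing step (a sketch; assumes the module ... endmodule structure described above) can isolate the RTL from any surrounding text:

import re

# Extract the first module ... endmodule block from the response
match = re.search(r"module\b.*?\bendmodule", response, re.DOTALL)
verilog_code = match.group(0) if match else response.strip()
print(verilog_code)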

Using the Simple Inference Script

# From the codellama-migration directory
python3 simple_finetuned_model_inference.py \
    --prompt "Generate a synchronous FIFO with 32-bit data width, depth 16" \
    --model-path "Elinnos/codellama-7b-fifo-verilog"

Training Code

The model was trained using the command below.

Training Command

python scripts/training/finetune_codellama.py \
    --base-model codellama/CodeLlama-7B-Instruct-hf \
    --dataset datasets/processed/split_chat_format/train.jsonl \
    --output-dir training-outputs/codellama-fifo-v2-chat \
    --max-length 1536 \
    --num-epochs 5 \
    --batch-size 4 \
    --gradient-accumulation 4 \
    --learning-rate 2e-05 \
    --lora-r 48 \
    --lora-alpha 96 \
    --lora-dropout 0.15

Evaluation

Training Metrics

  • Final Training Loss: 0.626
  • Final Validation Loss: 0.609
  • Training Loss Progression:
    • Epoch 1: 1.113
    • Epoch 2: 0.996
    • Epoch 3: 0.834
    • Epoch 4: 0.704
    • Epoch 5: 0.626

Model Performance

The model demonstrates strong capability in:

  • Generating syntactically correct Verilog code
  • Following FIFO design specifications
  • Producing clean, synthesizable RTL
  • Handling various FIFO configurations (width, depth, features)

Limitations

  • Model size: Requires base CodeLlama-7B-Instruct model (~13GB)
  • Context window: Maximum sequence length of 1536 tokens (see the length-check sketch below)
  • Domain: Specialized for FIFO designs, may not generalize to other RTL components
  • Language: Primarily English prompts
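
Because the prompt and the generated module share the 1536-token training window, it can help to check prompt length before generating; a minimal sketch:

MAX_CONTEXT = 1536

prompt_tokens = len(tokenizer(prompt)["input_ids"])
remaining = MAX_CONTEXT - prompt_tokens
if remaining < 800:  # max_new_tokens used in the generation example above
    print(f"Warning: only {remaining} tokens left for generation")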

Citation

If you use this model, please cite:

@misc{codellama_fifo_elinnos,
  title={CodeLlama-7B Fine-Tuned for Verilog FIFO Generation},
  author={Elinnos Systems},
  year={2025},
  url={https://huggingface.co/Elinnos/codellama-7b-fifo-verilog}
}

License

This model follows the same license as the base CodeLlama-7B-Instruct model (Llama 2 Community License).

Contact

For questions or issues, please contact:


Model trained by Elinnos Systems for Verilog FIFO code generation.
