---
language:
- en
license: llama2
tags:
- code-generation
- verilog
- systemverilog
- rtl
- hardware-design
- fifo
- semiconductor
- ip-design
- lora
- codellama
base_model: codellama/CodeLlama-7B-Instruct-hf
datasets:
- Elinnos/verilog-fifo-dataset
library_name: peft
pipeline_tag: text-generation
---

# CodeLlama-7B Fine-Tuned for Verilog FIFO Generation

## Model Description

This is a **LoRA fine-tuned CodeLlama-7B-Instruct** model specialized in generating clean, synthesizable Verilog/SystemVerilog code for FIFO (First-In-First-Out) designs. The model was trained on comprehensive FIFO specifications following industry best practices for RTL design.

**Model Type:** Causal Language Model (LoRA Adapter)
**Base Model:** [codellama/CodeLlama-7B-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7B-Instruct-hf)
**Training Date:** 2025-11-25
**Organization:** Elinnos Systems

## Key Features

✅ **Complete Code Generation** - Generates full Verilog modules with proper structure
✅ **Synthesizable RTL** - Clean, synthesizable code following best practices
✅ **FIFO Variants** - Supports various FIFO architectures (synchronous, asynchronous, parameterized)
✅ **Protocol Compliance** - Proper signal handling, full/empty flags, error handling
✅ **Production-Ready** - No comments, no debug statements, clean functional code

## Training Details

### Dataset

- **Training Samples:** 70
- **Validation Samples:** 9
- **Test Samples:** 15
- **Total Samples:** 94
- **Dataset Format:** CodeLlama chat template format
- **Average Sequence Length:** 1536 tokens

### Training Configuration

- **Base Model:** codellama/CodeLlama-7B-Instruct-hf
- **Training Method:** LoRA (Low-Rank Adaptation)
- **Quantization:** 4-bit NF4 (for GPU memory efficiency)
- **Training Steps:** 25 (5 epochs)
- **Epochs:** 5
- **Batch Size:** 4
- **Effective Batch Size:** 16 (with gradient accumulation)
- **Learning Rate:** 2e-05
- **Max Sequence Length:** 1536
- **Hardware:** NVIDIA A100-SXM4-40GB

### LoRA Configuration

```python
from peft import LoraConfig

LoraConfig(
    r=48,                  # Rank
    lora_alpha=96,         # Scaling factor
    target_modules=['q_proj', 'up_proj', 'down_proj', 'gate_proj',
                    'o_proj', 'k_proj', 'v_proj'],
    lora_dropout=0.15,
    bias="none",
    task_type="CAUSAL_LM"
)
```

**Target Modules:**
- `q_proj`
- `up_proj`
- `down_proj`
- `gate_proj`
- `o_proj`
- `k_proj`
- `v_proj`

**Trainable Parameters:** ~672M (LoRA adapter only)
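To sanity-check the trainable-parameter count on your own hardware, the configuration above can be applied to the base model with PEFT's `get_peft_model`. This is a minimal sketch, not part of the published training script, and it assumes enough memory to load the half-precision 7B weights:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in half precision (roughly 14 GB of memory)
base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7B-Instruct-hf",
    torch_dtype=torch.float16,
)

# Same LoRA configuration as used for this adapter
lora_config = LoraConfig(
    r=48,
    lora_alpha=96,
    target_modules=['q_proj', 'up_proj', 'down_proj', 'gate_proj',
                    'o_proj', 'k_proj', 'v_proj'],
    lora_dropout=0.15,
    bias="none",
    task_type="CAUSAL_LM",
)

# Wrap the base model and report trainable vs. total parameter counts
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```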
## Model Capabilities

### What This Model Can Do

1. **Synchronous FIFO Design**
   - Dual-port memory architecture
   - Write/Read pointer management
   - Full/Empty flag generation
   - Configurable width and depth
   - Error handling (write_err, read_err)
   - Threshold flags (almost_full, almost_empty)
   - Occupancy output

2. **Asynchronous FIFO Design**
   - Gray code pointer synchronization
   - Clock domain crossing (CDC)
   - Separate read/write clocks
   - Data valid signals

3. **Advanced Features**
   - Parameterized FIFO modules
   - AXI-like handshake protocols
   - Pipelined output stages
   - Peek capability
   - Clear signal support

4. **Code Quality**
   - Clean, synthesizable RTL
   - No comments or debug statements
   - Proper signal naming conventions
   - Modular design structure

## Usage

### Installation

```bash
pip install transformers peft torch bitsandbytes accelerate
```

### Loading the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model with 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7B-Instruct-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
    trust_remote_code=True
)

# Load fine-tuned LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "Elinnos/codellama-7b-fifo-verilog"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7B-Instruct-hf")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
```

### Generating Code

```python
# Create prompt using the CodeLlama chat template
system_prompt = (
    "You are Elinnos RTL Code Generator v1.0, a specialized Verilog/SystemVerilog "
    "code generation agent. Your role: Generate clean, synthesizable RTL code for "
    "hardware design tasks. Output ONLY functional RTL code with no $display, "
    "assertions, comments, or debug statements."
)

user_prompt = "Generate a synchronous FIFO with 8-bit data width, depth 4, write_enable, read_enable, full flag, empty flag."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=800,
    temperature=0.3,
    do_sample=True,
    top_p=0.9,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id
)

# Decode response (only the newly generated tokens)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=False)
print(response)
```
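Because the decode above uses `skip_special_tokens=False`, the raw response can contain leftover special tokens such as `</s>`. The helper below is an illustrative sketch, not part of the published scripts; it strips those markers and extracts the first `module ... endmodule` block, continuing from the `response` variable in the previous snippet:

```python
import re

def extract_verilog(response: str) -> str:
    # Drop special tokens left in place by skip_special_tokens=False
    cleaned = response.replace("<s>", "").replace("</s>", "").strip()
    # Keep only the first module ... endmodule block, if one exists
    match = re.search(r"\bmodule\b.*?\bendmodule\b", cleaned, re.DOTALL)
    return match.group(0) if match else cleaned

verilog_code = extract_verilog(response)
print(verilog_code)
```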
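The adapter was fine-tuned with a 1536-token maximum sequence length (see Limitations below), so prompts whose length plus the generation budget exceed that window fall outside the training distribution. A small sanity check, reusing the numbers from the example above:

```python
# Assumption: 1536 is the fine-tuning window and 800 the max_new_tokens
# value used in the generation example above.
MAX_SEQ_LEN = 1536
GEN_BUDGET = 800

prompt_tokens = len(tokenizer(prompt)["input_ids"])
if prompt_tokens + GEN_BUDGET > MAX_SEQ_LEN:
    print(f"Warning: {prompt_tokens} prompt tokens + {GEN_BUDGET} new tokens "
          f"exceeds the {MAX_SEQ_LEN}-token fine-tuning window.")
```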
### Using the Simple Inference Script

```bash
# From the codellama-migration directory
python3 simple_finetuned_model_inference.py \
    --prompt "Generate a synchronous FIFO with 32-bit data width, depth 16" \
    --model-path "Elinnos/codellama-7b-fifo-verilog"
```

## Training Code

The model was trained using the following repository:

- **Repository:** [Elinnos/codellama-fifo-finetuning](https://huggingface.co/Elinnos/codellama-fifo-finetuning)
- **Training Script:** `scripts/training/finetune_codellama.py`

### Training Command

```bash
python scripts/training/finetune_codellama.py \
    --base-model codellama/CodeLlama-7B-Instruct-hf \
    --dataset datasets/processed/split_chat_format/train.jsonl \
    --output-dir training-outputs/codellama-fifo-v2-chat \
    --max-length 1536 \
    --num-epochs 5 \
    --batch-size 4 \
    --gradient-accumulation 4 \
    --learning-rate 2e-05 \
    --lora-r 48 \
    --lora-alpha 96 \
    --lora-dropout 0.15
```

## Evaluation

### Training Metrics

- **Final Training Loss:** 0.626
- **Final Validation Loss:** 0.609
- **Training Loss Progression:**
  - Epoch 1: 1.113
  - Epoch 2: 0.996
  - Epoch 3: 0.834
  - Epoch 4: 0.704
  - Epoch 5: 0.626

### Model Performance

The model demonstrates strong capability in:

- Generating syntactically correct Verilog code
- Following FIFO design specifications
- Producing clean, synthesizable RTL
- Handling various FIFO configurations (width, depth, features)

## Limitations

- **Model size:** Requires the base CodeLlama-7B-Instruct model (~13 GB)
- **Context window:** Maximum sequence length of 1536 tokens
- **Domain:** Specialized for FIFO designs; may not generalize to other RTL components
- **Language:** Primarily English prompts

## Citation

If you use this model, please cite:

```bibtex
@misc{codellama_fifo_elinnos,
  title={CodeLlama-7B Fine-Tuned for Verilog FIFO Generation},
  author={Elinnos Systems},
  year={2025},
  url={https://huggingface.co/Elinnos/codellama-7b-fifo-verilog}
}
```

## License

This model follows the same license as the base CodeLlama-7B-Instruct model (Llama 2 Community License).

## Contact

For questions or issues, please contact:

- **Organization:** Elinnos Systems
- **Repository:** [Elinnos/codellama-fifo-finetuning](https://huggingface.co/Elinnos/codellama-fifo-finetuning)

---

*Model trained by Elinnos Systems for Verilog FIFO code generation.*