| | --- |
| | language: |
| | - en |
| | tags: |
| | - sql |
| | - text-to-sql |
| | - daraz |
| | - llama3 |
| | - unsloth |
| | - ecommerce |
| | license: apache-2.0 |
| | datasets: |
| | - custom |
| | base_model: unsloth/llama-3-8b-bnb-4bit |
| | --- |
| | |
| | # drz-sql-llama3 |
| |
|
| | This model is a fine-tuned version of Llama 3 (8B) for generating SQL queries specific to the Daraz e-commerce platform. |
| |
|
| | ## Model Description |
| |
|
| | - **Base Model:** Llama 3 8B (4-bit quantized) |
| | - **Fine-tuning Method:** LoRA (Low-Rank Adaptation) |
| | - **Training Data:** 20 Daraz-specific SQL query examples |
| | - **Use Case:** Converting natural language questions to SQL queries for Daraz analytics |
| |
|
| | ## Training Details |
| |
|
| | - **Framework:** Unsloth |
| | - **LoRA Rank:** 16 |
| | - **Training Steps:** 100 |
| | - **Batch Size:** 2 |
| | - **Gradient Accumulation:** 4 |
| | - **Learning Rate:** 0.0002 |
| |
|
| | ## Key Features |
| |
|
| | This model understands Daraz-specific: |
| | - Table schemas (e.g., `daraz_cdm.dwd_drz_trd_core_df`, `daraz_cdm.dwd_drz_prd_sku_extension`) |
| | - Business logic (Choice classification, KAM assignments, industry mapping) |
| | - Query patterns (MAX_PT for partitions, DATEADD for date filtering) |
| | - Metrics (GMV, L7/L30 calculations, order types) |
| | |
| | ## Usage |
| | |
| | ```python |
| | from unsloth import FastLanguageModel |
| | |
| | # Load model |
| | model, tokenizer = FastLanguageModel.from_pretrained( |
| | model_name = "Bilal326/drz-sql-llama3", |
| | max_seq_length = 2048, |
| | dtype = None, |
| | load_in_4bit = True, |
| | ) |
| | |
| | FastLanguageModel.for_inference(model) |
| | |
| | # Generate SQL |
| | alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. |
| |
|
| | ### Instruction: |
| | {} |
| |
|
| | ### Input: |
| | {} |
| |
|
| | ### Response: |
| | {}""" |
| |
|
| | prompt = alpaca_prompt.format( |
| | "Generate SQL for the following request:", |
| | "Get total GMV for last 30 days in Pakistan", |
| | "" |
| | ) |
| | |
| | inputs = tokenizer([prompt], return_tensors="pt").to("cuda") |
| | outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.5) |
| | print(tokenizer.decode(outputs[0])) |
| | ``` |
| | |
| | ## Example Queries |
| | |
| | The model can handle: |
| | - Simple aggregations: "Get total GMV and orders for last 30 days" |
| | - Complex joins: "Get seller performance with KAM assignments" |
| | - Time-based analysis: "Show monthly GMV trend by industry" |
| | - Advanced logic: "Compare Choice vs Non-Choice GMV in Crossborder" |
| | |
| | ## Limitations |
| | |
| | - Trained specifically for Daraz schema and business logic |
| | - May not generalize to other SQL dialects or schemas |
| | - Requires Daraz-specific tables to be available |
| | |
| | ## Training Dataset |
| | |
| | Custom dataset of 20 SQL query examples covering: |
| | - Revenue and GMV analysis |
| | - Product performance metrics |
| | - Seller segmentation |
| | - Category and brand analysis |
| | - Time-based trends |
| | |
| | ## Citation |
| | |
| | If you use this model, please cite: |
| | |
| | ``` |
| | @misc{drz-sql-llama3, |
| | author = {Bilal326}, |
| | title = {drz-sql-llama3: Daraz SQL Generation Model}, |
| | year = {2025}, |
| | publisher = {HuggingFace}, |
| | url = {https://huggingface.co/Bilal326/drz-sql-llama3} |
| | } |
| | ``` |
| | |
| | ## Acknowledgments |
| | |
| | - Built with [Unsloth](https://github.com/unslothai/unsloth) |
| | - Based on Meta's Llama 3 |
| | - Fine-tuned for Daraz e-commerce analytics |
| | |