---
library_name: transformers
pipeline_tag: text-generation
base_model: meta-llama/Llama-3.2-3B
tags:
- text-generation
- mdcat
- medical
- education
license: apache-2.0
---

# MDCAT-Llama3.2-3B

This is a 4-bit quantized version of LLaMA 3.2 3B, fine-tuned for answering MDCAT (Medical and Dental College Admission Test) questions. It uses Parameter-Efficient Fine-Tuning (PEFT) with QLoRA to provide accurate responses to MDCAT-related queries in biology, chemistry, physics, and medical topics, while refusing non-MDCAT questions.

## Model Details

### Model Description

Designed to assist MDCAT students, this model delivers precise answers within its domain and rejects off-topic queries. It is quantized to 4-bit precision for efficiency.

- **Developed by:** abdullah1101
- **Model type:** Text generation (causal language model)
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from:** meta-llama/Llama-3.2-3B
- **Size:** 2.35 GB (4-bit quantized)

### Model Sources

- **Repository:** https://huggingface.co/abdullah1101/MDCAT-Llama3.2-3B

## Uses

### Direct Use

Use the model via the Hugging Face Inference API (once processed) or load it locally for MDCAT question answering.

#### Local Usage Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the model in 4-bit precision with float16 compute
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "abdullah1101/MDCAT-Llama3.2-3B",
    quantization_config=quant_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("abdullah1101/MDCAT-Llama3.2-3B")

# Ask an MDCAT-style question and generate an answer
inputs = tokenizer("Question: What is the function of the liver?\nAnswer: ", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=200, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
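
#### Inference API Example (sketch)

The Direct Use section also mentions the Hugging Face Inference API. The following is a minimal sketch, assuming the model has been processed by the hosted Inference API and that `huggingface_hub` is installed; the token value and prompt are placeholders.

```python
# Hypothetical sketch: querying the hosted Inference API via huggingface_hub.
# Assumes the model is available on the Inference API and that "hf_xxx" is
# replaced with a valid access token.
from huggingface_hub import InferenceClient

client = InferenceClient(model="abdullah1101/MDCAT-Llama3.2-3B", token="hf_xxx")

prompt = "Question: What is the function of the liver?\nAnswer: "
# Send a plain text-generation request to the endpoint
response = client.text_generation(prompt, max_new_tokens=200)
print(response)
```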