---
license: llama3.1
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
- llama3.1
- fine-tuned
- merged
- peft
- lora
language:
- en
pipeline_tag: text-generation
---

# Merged LLaMA 3.1 8B Model

This model is the base LLaMA 3.1 8B Instruct model with LoRA fine-tuned weights merged in.

## Model Details

- **Base Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **LoRA Model**: atacod/llama-3.1-8b-test-finetuning
- **Model Type**: Causal Language Model
- **Architecture**: LLaMA 3.1

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Replace with the actual repository ID of this model
model_name = "your-username/your-repo-name"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Example usage
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your prompt here"}
]

# Build the prompt with the chat template and move it to the model's device
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

## Training Details

This model was fine-tuned using the LoRA (Low-Rank Adaptation) technique, and the resulting adapter was then merged into the base model for easier deployment.

## Limitations

Please refer to the limitations of the base model and use this model responsibly.
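
As a rough illustration of the merge step mentioned under Training Details, the sketch below applies the standard LoRA merge formula, `W_merged = W + (alpha / r) * B @ A`, to a toy weight matrix in plain Python. The matrices, `alpha`, and `r` are made-up values for illustration only; the actual rank, scaling, and target layers used for this model are not stated in this card.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values (assumptions): W is 2x3, B is 2x1, A is 1x3, so the rank r is 1.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
A = [[0.5, -0.5, 1.0]]
B = [[2.0], [4.0]]
alpha, r = 2, 1

W_merged = merge_lora(W, A, B, alpha, r)

# Check on one input column vector: the merged matrix gives the same output
# as running the base weights and the scaled adapter path separately.
x = [[1.0], [2.0], [3.0]]
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
y_adapter = [[b[0] + (alpha / r) * a[0]] for b, a in zip(base_out, adapter_out)]
y_merged = matmul(W_merged, x)

print(W_merged)
print(y_merged == y_adapter)
```

Because the adapter update is folded into the weights, a merged checkpoint needs no separate adapter path at inference time. In practice, the PEFT library offers `merge_and_unload()` on a loaded LoRA model for this purpose; whether this checkpoint was produced that way is not stated here.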