Базовая модель была дообучена на неофицильные тексты с большим количеством смайликов, разговорных слов и выражений
Дообучение производилось с помощью DPO
Примеры генерации:
user: How are you doing?
assistant: Yay for getting back into the swing of things! It's been a while since I last saw you, but I'm still feeling a bit lost. I'm so glad you're back in the swing of life! 🎉 Here's what I'm doing to help you get back on track:\n\nStart small: Don't try to tackle everything at once. Break it down into smaller, manageable chunks. For example, if you're feeling overwhelmed with social media, start with a few posts per day and gradually work your way up to more.\n\nFind your why: What's driving you? What's your purpose in life? What's your passion? Write it down! What's the big idea you want to take to the next level? What's the one thing you want to achieve? This will help you connect with your values and make it more tangible.\n\nPractice self-care: Take care of your physical, emotional, and mental well-being. Exercise, eat healthy, and get enough sleep. It's like a workout for your brain!\n\nConnect with others: Meet new people, join a club or group, or attend a community event.
user: Write recipe for apple pie
assistant: Yay for apple pie! Here's a simple recipe to make your taste buds dance with the sweet, buttery goodness:\n\nThe Classic Apple Pie Recipe\n\nIngredients:\n\n 2 1/4 cups all-purpose flour\n 1 tsp salt\n 1/2 cup granulated sugar\n 1/2 cup unsalted butter, at room temperature\n* 1/2 cup whole milk\n* 2 large eggs\n* 2 tsp pure vanilla extract\n* 1/2 cup chopped walnuts or pecans (optional)\n\nOptional ingredients:\n\n 1/4 cup brown sugar\n 1/4 cup granulated sugar\n 1/4 cup brown sugar\n 1/4 cup brown sugar\n* 1/4 cup brown sugar\n* 1/4 cup chopped pecans or walnuts (optional)\n\n**Instructions:**\n\n1. Preheat your oven: Set your oven to 350°F (180°C).\n2. Make the crust: In a large bowl, whisk together flour, salt, and sugar. Add the eggs one at a time, and whisk until the eggs are fully incorporated. Add the milk, salt, and vanilla extract.\n3. Whisk in the butter: In a separate bowl, beat the butter until it's light and fluffy. Add the flour, salt, and sugar.\n4. Add the eggs: Add the eggs one at a time, beating well after each addition.\n5. Add the vanilla: Beat in the vanilla extract.\n6. Combine the dry ingredients: Add the flour, salt, and sugar.
user: Write your plan for day
assistant: Time to get moving! Here's a 5-day plan to help you get the most out of your day:\n\nDay 1: Wake-Up and Stretch (10:00 AM - 12:00 PM)\n\n Get some morning sunlight and a good cup of hot tea or a warm cup of herbal tea.\n Take a few deep breaths, stretch your arms, legs, and back, and get a head start on the day.\n Grab a quick lunch or grab a quick snack to fuel up for the day.\n\nDay 2: Breakfast (12:00 PM - 1:00 PM)\n\n Grab a quick breakfast like a granola bar, fruit, or a bowl of oatmeal.\n Grab a quick lunch like a sandwich, salad, or a burger.\n Grab a quick snack like a fruit or a piece of fruit.\n* Grab a quick lunch like a sandwich, salad, or a burger.\n\nDay 3: Work or Exercise (1:00 PM - 3:00 PM)\n\n Grab a quick lunch like a sandwich, salad, or a burger.\n Grab a quick workout like a yoga or a brisk walk.\n Grab a quick snack like a fruit or a piece of fruit.\n Grab a quick lunch like a sandwich, salad, or a burger.\n\n**Day 4: Dinner (3:00 PM - 5:00 PM)*\n\n Grab a quick lunch like a sandwich, salad, or a burger.
Пример запуска:
from transformers import AutoTokenizer, AutoModelForCausalLM
device = 'cuda'
messages = [{"role": "user", "content": "How are you doing?"}]
tokenizer = AutoTokenizer.from_pretrained("Yegor25/llm-course-hw2-dpo")
input_ids = tokenizer.apply_chat_template(messages, max_length=1024, truncation=True, add_generation_prompt=True, return_tensors='pt').to(device)
model_dpo = AutoModelForCausalLM.from_pretrained("Yegor25/llm-course-hw2-dpo").to(device)
generated_ids = model_dpo.generate(input_ids, max_new_tokens=512, do_sample=False)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- Downloads last month
- -
Model tree for Yegor25/llm-course-hw2-dpo
Base model
HuggingFaceTB/SmolLM-135M