---
license: apache-2.0
language:
  - en
base_model:
  - openai-community/gpt2-medium
pipeline_tag: text-generation
---

# GPT-2 (From Scratch in PyTorch) — Fine-Tuned Version

This model is a custom GPT-2 implementation built entirely from scratch in PyTorch (no Hugging Face Transformers for the architecture itself) and fine-tuned on a custom dataset using Supervised Fine-Tuning (SFT).
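
Since the architecture is hand-written, inference is just an autoregressive loop over the model's logits. Below is a minimal greedy-decoding sketch; `GPTModel` and the checkpoint filename are hypothetical placeholders for this repo's own class and weights, and `tiktoken` supplies the GPT-2 BPE encoding:

```python
import tiktoken
import torch

tokenizer = tiktoken.get_encoding("gpt2")  # GPT-2 byte-pair encoding

# Hypothetical: substitute this repo's actual model class and checkpoint.
# model = GPTModel(config)
# model.load_state_dict(torch.load("checkpoint.pth"))
# model.eval()

@torch.no_grad()
def generate(model, prompt: str, max_new_tokens: int = 50) -> str:
    """Greedy decoding against any module returning (B, T, vocab) logits."""
    ids = torch.tensor([tokenizer.encode(prompt)])            # (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)                                   # (1, T, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
    return tokenizer.decode(ids[0].tolist())
```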

## Model Details

- **Architecture:** GPT-2 (from scratch)
- **Variants Supported:** gpt2-small, gpt2-medium, gpt2-large, gpt2-xl (typical configurations are sketched below)
- **Framework:** PyTorch
- **Pretraining Source:** GPT-2 pretrained weights loaded from the OpenAI format
- **Fine-Tuning Method:** Supervised Fine-Tuning (SFT; a minimal training-step sketch follows the list)
- **Fine-Tuning Data:** Custom dataset (domain-specific; see dataset section)
- **Tokenization:** GPT-2 byte-pair encoding (BPE) tokenizer
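
The four variant names follow the published GPT-2 sizes. For reference, these are the hyperparameters conventionally attached to each; the field names here are illustrative, not necessarily the repo's own:

```python
# Illustrative GPT-2 hyperparameters per variant (published GPT-2 sizes);
# field names are conventional, not necessarily this repo's config keys.
GPT2_CONFIGS = {
    "gpt2-small":  {"n_layer": 12, "n_head": 12, "n_embd": 768},   # ~124M params
    "gpt2-medium": {"n_layer": 24, "n_head": 16, "n_embd": 1024},  # ~355M params
    "gpt2-large":  {"n_layer": 36, "n_head": 20, "n_embd": 1280},  # ~774M params
    "gpt2-xl":     {"n_layer": 48, "n_head": 25, "n_embd": 1600},  # ~1.5B params
}
# Shared across variants: vocab_size = 50257, context_length = 1024.
```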

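In SFT terms, fine-tuning a decoder-only model reduces to next-token cross-entropy over the formatted (instruction, response) text. A minimal training-step sketch, assuming `model` returns raw logits of shape `(batch, seq_len, vocab_size)`; the repo's actual loop may differ (e.g., by masking prompt tokens out of the loss):

```python
import torch
import torch.nn.functional as F

def sft_step(model, optimizer, input_ids: torch.Tensor) -> float:
    """One SFT step: standard causal-LM loss on a batch of token IDs.

    input_ids: (batch, seq_len) tokens from the SFT dataset. This sketch
    trains on all positions; prompt-token masking is a common refinement.
    """
    logits = model(input_ids[:, :-1])      # predict token t+1 from tokens <= t
    targets = input_ids[:, 1:]             # labels shifted one position left
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
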
## Repo