Let's Talk about AI
Hello! This is an open space for everyone to talk, share, ask, and show anything about AI.
Has anyone pre-trained an LLM from scratch? If yes, please share your experience: things to consider while training, notes, tips, etc.
Hi, I am also interested in LLMs. I am about to start this research next week; please share any inputs.
Hey @Shashank2k3, if you want your own LLM, you first need huge amounts of data. You can start by fine-tuning already-available, good LLMs like Gemma, Phi, Llama, Mistral, etc. on your dataset. Start with small models of 4 to 7B parameters. For pre-training an LLM from scratch you need enormous data, serious resources like heavy-duty GPUs and CPUs, and knowledge of training techniques, NLP, etc. You can always brainstorm with ChatGPT to learn more.
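For a concrete starting point, here is a rough LoRA fine-tuning sketch using Hugging Face transformers and peft. The model name, data file, and hyperparameters are just placeholders to show the shape of the workflow, not recommendations:

```python
# Minimal LoRA fine-tuning sketch; model name, data file, and hyperparameters
# are illustrative placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "google/gemma-2b"  # any small (2-7B) base model from the Hub
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # some tokenizers ship without one

model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
# LoRA trains small adapter matrices while the base weights stay frozen,
# so this can fit on a single consumer GPU.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

ds = load_dataset("text", data_files="my_corpus.txt")["train"]  # your own data
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma-lora",
                           per_device_train_batch_size=2,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```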
Hey @kalashshah19, thanks for the input! I already have a solid foundation in these areas from my Bachelor's degree in AIML, and now I'm looking to dive deeper into the world of LLMs.
Great!
Yup, so what do you guys do? I mean, profession-wise!
I am an Associate Data Scientist at Casepoint.
What about you?
Same here, bro. I fine-tuned Qwen2.5-3B on my Gita dataset; accuracy was sixty-something percent with loss around 1.42, and the responses were okay-ish. It was writing correct Hindi, but when I made a Space out of it, it just writes anything.
Yeah, exactly bro. I thought good loss and accuracy meant it was learning properly, but it turns out it's more about the data structure. My model also writes proper Hindi but gives totally unrelated or random outputs. I think that's because plain-text data only teaches it next-word prediction, not how to follow instructions or answer properly.
I'm planning to try instruction-style data next (ai4bharat/indic-instruct-data-v0.1). Maybe that'll fix it. Have you tried using instruction-tuned datasets or formatting your data as Q&A pairs? How did you fix that issue?
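For reference, this is roughly what I mean by formatting the data as instructions rather than plain text. It's just a sketch that assumes a Qwen chat template; the example Q&A pair is made up, and field names will differ per dataset, so check the dataset card for the real schema:

```python
# Sketch: turning a plain Q&A pair into the instruction/chat format the
# model was aligned on; the example pair below is made up.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")

def to_chat_text(question: str, answer: str) -> str:
    # apply_chat_template wraps the pair in the model's special role tokens;
    # this is what teaches the model to answer instead of merely continuing text.
    messages = [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]
    return tokenizer.apply_chat_template(messages, tokenize=False)

print(to_chat_text("What does the Gita say about duty?",
                   "Act according to your dharma, without attachment to results."))
```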
SarathiAI v1.0
Eternal Gita wisdom, guided by AI.
I tried fine-tuning the Qwen2.5-3B model on a custom Bhagavad Gita dataset for the first time. The model now understands user queries more accurately and responds with answers grounded in Gita teachings. It's not perfect, but I'm happy with everything I learned building it. Please give it a try and share what I should improve.
Link to Space: https://huggingface.co/spaces/JDhruv14/Sarathi.AI
Link to Model: https://huggingface.co/JDhruv14/Qwen2.5-3B-Gita-FT
Link to Dataset: https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita-QA
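If you'd rather try it locally instead of on the Space, something like this should work. It's a sketch that assumes the fine-tune kept Qwen's chat template; the model card has the authoritative usage:

```python
# Quick local try-out sketch; assumes the fine-tune kept Qwen's chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JDhruv14/Qwen2.5-3B-Gita-FT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What does the Gita teach about detachment?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False,
                                       add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```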
Guys, I am building (fine-tuning) a model and I need a lot of compute and VRAM. Please tell me where I can find free or cheap GPUs, because AWS, Azure, etc. are very costly. Maybe Kaggle won't work here, but I will try it tomorrow. I am thinking of using Ola's cloud, but if there's a platform better than Ola, please suggest it.
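In the meantime, I'm also going to try loading the model in 4-bit (QLoRA-style) to cut VRAM, which might make a free Kaggle/Colab T4 enough. A rough sketch, with the model name just as an example:

```python
# Sketch: 4-bit (QLoRA-style) loading to shrink VRAM needs; the model name
# is only an example. Requires the bitsandbytes package and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B",
                                             quantization_config=bnb,
                                             device_map="auto")
print(f"{model.get_memory_footprint() / 1e9:.1f} GB")  # roughly 1/4 of fp16
```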
Hello everyone,
I'm excited to share NEET_BioBERT, a fine-tuned lightweight transformer model trained specifically on NEET-style biology multiple-choice questions.
It's designed for educational AI assistants, practice-exam bots, and MCQ reasoning systems.
Explore it here: https://huggingface.co/Neural-Hacker/NEET_BioBERT
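Here's a quick-start sketch for scoring one MCQ. I'm assuming the standard multiple-choice head and input format here, so check the model card for the actual architecture and usage; the question is just an example:

```python
# Quick-start sketch; the multiple-choice head and input format are an
# assumption - see the model card for the actual usage.
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

model_id = "Neural-Hacker/NEET_BioBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMultipleChoice.from_pretrained(model_id)

question = "Which organelle is the main site of aerobic respiration?"
options = ["Ribosome", "Mitochondrion", "Golgi apparatus", "Lysosome"]

# Encode each (question, option) pair, then add a batch dimension:
# the final shape is (1, num_choices, seq_len).
enc = tokenizer([question] * len(options), options,
                return_tensors="pt", padding=True, truncation=True)
enc = {k: v.unsqueeze(0) for k, v in enc.items()}

with torch.no_grad():
    logits = model(**enc).logits  # (1, num_choices)
print("Predicted:", options[logits.argmax(-1).item()])
```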
Please use it, and upvote it if you find it helpful!
Is it fine-tuned only on question papers, or on all NEET content like books, PDFs, etc.?
It is fine-tuned on a dataset consisting of ~800 questions, including practice questions and PYQs (previous year questions).
Nice!