Let's Talk about AI

#1
by kalashshah19 - opened
Indian AI Developers org
edited Aug 22

Hello, here is an open space for everyone to talk, share, ask and show anything about AI.

kalashshah19 pinned discussion
Indian AI Developers org

Has anyone pre-trained LLM model from scratch ? If yes then share your experience, things to consider while training, notes, tips etc.

Indian AI Developers org

Hi i am also intrested into LLM Model , i am about to start this reserach from next week please give any inputs

Indian AI Developers org

Hi i am also intrested into LLM Model , i am about to start this reserach from next week please give any inputs

Hey @Shashank2k3 , if you want your own LLM model, first you need huge data. You can start with fine tuning already available good LLM models like Gemma, Phi, LLAMA, mistral etc with your dataset. Start with small models of sizes like 4 to 7B parameters. For pre-training LLM from scratch you need enormous data, good resources like heavy duty GPUs and CPUs and also have knowledge of training techniques, NLP, etc . You can always brainstorm with ChatGPT to get more knowledge.

Indian AI Developers org

Hey @kalashshah19 , thanks for the input! I already have a solid foundation in these areas from my Bachelor's degree in AIML, and now I’m looking to dive deeper into the world of LLMs.

Indian AI Developers org

Hey @kalashshah19 , thanks for the input! I already have a solid foundation in these areas from my Bachelor's degree in AIML, and now I’m looking to dive deeper into the world of LLMs.

Great !

Indian AI Developers org

Yupp so what you guys do, i mean profession!!!

Indian AI Developers org

Yupp so what you guys do, i mean profession!!!

I am an Associate Data Scientist at Casepoint.
What about you ?

Indian AI Developers org

I am an Associate Software Developer at Brillius Technologies

Indian AI Developers org

I am an Associate Software Developer at Brillius Technologies

Great, where are you from and where is the company ?

Indian AI Developers org

I am from hyderabad , company origin is calfornia pleasonton , but i am working at hyderabad branch.

Indian AI Developers org

I am from hyderabad , company origin is calfornia pleasonton , but i am working at hyderabad branch.

Okay, so do u work in AI ML there ?

Indian AI Developers org

Hello guys

Indian AI Developers org

Hello guys

Yo whatsup ?

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

I am great. Where do u work ?

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

I am great. Where do u work ?

I work at HelpingAI

Indian AI Developers org

Yeah previously my company is in it staffing side , just 6 months ago they have started in IT solutions to build product called brillius AI tutor where we are developing an online learning platform for it professionals, in that I'm taking care of many things where AIML is one of those things.

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

I am great. Where do u work ?

I work at HelpingAI

Are you talking about this ? - https://huggingface.co/HelpingAI
Is it a company or Open Source Community ?

Indian AI Developers org

Yeah previously my company is in it staffing side , just 6 months ago they have started in IT solutions to build product called brillius AI tutor where we are developing an online learning platform for it professionals, in that I'm taking care of many things where AIML is one of those things.

Wow, great. Is it a startup or big company ?

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

I am great. Where do u work ?

I work at HelpingAI

Are you talking about this ? - https://huggingface.co/HelpingAI
Is it a company or Open Source Community ?

A small startup

Indian AI Developers org

Hello guys

Yo whatsup ?

fine and what about u?

I am great. Where do u work ?

I work at HelpingAI

Are you talking about this ? - https://huggingface.co/HelpingAI
Is it a company or Open Source Community ?

A small startup

Okay and is the link for the Company correct ?

Indian AI Developers org

yes

Indian AI Developers org
edited Sep 10

yes

Is it your company (startup)? I mean are you owner or founder ?

Indian AI Developers org

yes

Is it your company (startup)? I mean are you owner or founder ?

Yes, I am its founder.

Indian AI Developers org

yes

Is it your company (startup)? I mean are you owner or founder ?

Yes, I am its founder.

Great, keep up the good work and whats your vision ?

Indian AI Developers org
                                                                                                                                                            
                                                                                                                                                            
HHHHHHHHH     HHHHHHHHH                   lllllll                      iiii                                                     AAA               IIIIIIIIII
H:::::::H     H:::::::H                   l:::::l                     i::::i                                                   A:::A              I::::::::I
H:::::::H     H:::::::H                   l:::::l                      iiii                                                   A:::::A             I::::::::I
HH::::::H     H::::::HH                   l:::::l                                                                            A:::::::A            II::::::II
  H:::::H     H:::::H      eeeeeeeeeeee    l::::lppppp   ppppppppp   iiiiiiinnnn  nnnnnnnn       ggggggggg   ggggg          A:::::::::A             I::::I  
  H:::::H     H:::::H    ee::::::::::::ee  l::::lp::::ppp:::::::::p  i:::::in:::nn::::::::nn    g:::::::::ggg::::g         A:::::A:::::A            I::::I  
  H::::::HHHHH::::::H   e::::::eeeee:::::eel::::lp:::::::::::::::::p  i::::in::::::::::::::nn  g:::::::::::::::::g        A:::::A A:::::A           I::::I  
  H:::::::::::::::::H  e::::::e     e:::::el::::lpp::::::ppppp::::::p i::::inn:::::::::::::::ng::::::ggggg::::::gg       A:::::A   A:::::A          I::::I  
  H:::::::::::::::::H  e:::::::eeeee::::::el::::l p:::::p     p:::::p i::::i  n:::::nnnn:::::ng:::::g     g:::::g       A:::::A     A:::::A         I::::I  
  H::::::HHHHH::::::H  e:::::::::::::::::e l::::l p:::::p     p:::::p i::::i  n::::n    n::::ng:::::g     g:::::g      A:::::AAAAAAAAA:::::A        I::::I  
  H:::::H     H:::::H  e::::::eeeeeeeeeee  l::::l p:::::p     p:::::p i::::i  n::::n    n::::ng:::::g     g:::::g     A:::::::::::::::::::::A       I::::I  
  H:::::H     H:::::H  e:::::::e           l::::l p:::::p    p::::::p i::::i  n::::n    n::::ng::::::g    g:::::g    A:::::AAAAAAAAAAAAA:::::A      I::::I  
HH::::::H     H::::::HHe::::::::e         l::::::lp:::::ppppp:::::::pi::::::i n::::n    n::::ng:::::::ggggg:::::g   A:::::A             A:::::A   II::::::II
H:::::::H     H:::::::H e::::::::eeeeeeee l::::::lp::::::::::::::::p i::::::i n::::n    n::::n g::::::::::::::::g  A:::::A               A:::::A  I::::::::I
H:::::::H     H:::::::H  ee:::::::::::::e l::::::lp::::::::::::::pp  i::::::i n::::n    n::::n  gg::::::::::::::g A:::::A                 A:::::A I::::::::I
HHHHHHHHH     HHHHHHHHH    eeeeeeeeeeeeee llllllllp::::::pppppppp    iiiiiiii nnnnnn    nnnnnn    gggggggg::::::gAAAAAAA                   AAAAAAAIIIIIIIIII
                                                  p:::::p                                                 g:::::g                                           
                                                  p:::::p                                     gggggg      g:::::g                                           
                                                 p:::::::p                                    g:::::gg   gg:::::g                                           
                                                 p:::::::p                                     g::::::ggg:::::::g                                           
                                                 p:::::::p                                      gg:::::::::::::g                                            
                                                 ppppppppp                                        ggg::::::ggg                                              
                                                                                                     gggggg                                                 

                                                               
        🚀 Keep building, keep innovating!
Indian AI Developers org

@kalashshah19 its a medium scale company , but for me its a startup cause we are the first team of IT , and we dont have any seniors to guide us

Indian AI Developers org
edited Sep 10

yes

Is it your company (startup)? I mean are you owner or founder ?

Yes, I am its founder.

Great, keep up the good work and whats your vision ?

Making agi token and time efficient

Indian AI Developers org

Btw try Dhanishtha 2.0 preview
HelpingAI.co

Indian AI Developers org

@kalashshah19 its a medium scale company , but for me its a startup cause we are the first team of IT , and we dont have any seniors to guide us

ohh, I see

Indian AI Developers org

yes

Is it your company (startup)? I mean are you owner or founder ?

Yes, I am its founder.

Great, keep up the good work and whats your vision ?

Making agi token and time efficient

Cool

Indian AI Developers org

Btw try Dhanishtha 2.0 preview
HelpingAI.co

Sure I will !

Indian AI Developers org

hey bros

Indian AI Developers org

hey bros

yo whatsup !

Indian AI Developers org

Can we create an interface for our community for easy chatting

Indian AI Developers org

Yeah Sure, good idea !

Indian AI Developers org
edited Sep 13

Shashank2k3/Fake-Profile-Detection-Instagram, please check out my project Fake Profile Detection, where I have trained a random forest model using my own dataset consisting of metadata from various fake and real profiles on Instagram.

Indian AI Developers org

Can we create an interface for our community for easy chatting

Great idea! We should create a Discord server

Indian AI Developers org

Shashank2k3/Fake-Profile-Detection-Instagram, please check out my project Fake Profile Detection, where I have trained a random forest model using my own dataset consisting of metadata from various fake and real profiles on Instagram.

Nice, good effort !

Indian AI Developers org

Can we create an interface for our community for easy chatting

Great idea! We should create a Discord server

So you are suggesting that the HuggingFace community should chat in Discord community 😂
I think he told to create an inference on HuggingFace for chatting.

Indian AI Developers org

Can we create an interface for our community for easy chatting

Great idea! We should create a Discord server

So you are suggesting that the HuggingFace community should chat in Discord community 😂
I think he told to create an inference on HuggingFace for chatting.

😂

Indian AI Developers org

Hello everyone, I'm a 2nd year DSAI student at IIT Guwahati, i have fine-tuned few models and i have published few articles on ResearchGate. I am determined to build state-of-the-art AI in India for the world.
Its really great to be a part of this community.
Thank you

Indian AI Developers org
edited Sep 24

Hello everyone, I'm a 2nd year DSAI student at IIT Guwahati, i have fine-tuned few models and i have published few articles on ResearchGate. I am determined to build state-of-the-art AI in India for the world.
Its really great to be a part of this community.
Thank you

Hey Ashish, its great to have you in our community. We all will grow as Indian AI Devs and learn new things from each other. Keep in loop !
And also remember that we all are friends so no need to be formal !

Indian AI Developers org

Btw try Dhanishtha 2.0 preview
HelpingAI.co

i have used it (spaces), i even shared it with my friends

Indian AI Developers org

has anyone contributed datasets or models on AIkosh?

Indian AI Developers org
edited Sep 26

https://huggingface.co/Neural-Hacker/Qwen-BharatBench-Legal
please try this and share your feedback (i'm working on another version to make it even better)

Indian AI Developers org

sure !

Indian AI Developers org

sure !

please add it in the collection if you like it

Indian AI Developers org

Hi everyone :)
I created a dataset on Bhagavad Gita which happens to be the most liked dataset and most downloaded dataset on HuggingFace. I'd appreciate if you make something out of it.
Link :https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita_Dataset

Indian AI Developers org

Hi everyone :)
I created a dataset on Bhagavad Gita which happens to be the most liked dataset and most downloaded dataset on HuggingFace. I'd appreciate if you make something out of it.
Link :https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita_Dataset

already liked it and will use in a planned project

Indian AI Developers org

sure !

please add it in the collection if you like it

Done !

Indian AI Developers org

Hi everyone :)
I created a dataset on Bhagavad Gita which happens to be the most liked dataset and most downloaded dataset on HuggingFace. I'd appreciate if you make something out of it.
Link :https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita_Dataset

Woah great !

Indian AI Developers org

🚀 Big news from XenArcAI!

We’ve just released our new dataset: Bhagwat‑Gita‑Infinity 🌸📖

✨ What’s inside:

  • Verse‑aligned Sanskrit, Hindi, and English
  • Clean, structured, and ready for ML/AI projects
  • Perfect for research, education, and open‑source exploration

🔗 Hugging Face: https://huggingface.co/datasets/XenArcAI/Bhagwat-Gita-Infinity

Let’s bring timeless wisdom into modern AI together 🙌

I hope you all love this dataset and contribute positively to AI/ML Research

Indian AI Developers org

🚀 Big news from XenArcAI!

We’ve just released our new dataset: Bhagwat‑Gita‑Infinity 🌸📖

✨ What’s inside:

  • Verse‑aligned Sanskrit, Hindi, and English
  • Clean, structured, and ready for ML/AI projects
  • Perfect for research, education, and open‑source exploration

🔗 Hugging Face: https://huggingface.co/datasets/XenArcAI/Bhagwat-Gita-Infinity

Let’s bring timeless wisdom into modern AI together 🙌

I hope you all love this dataset and contribute positively to AI/ML Research

Great, congratulations !

Indian AI Developers org

Hey everyone 👋

I've been working on this dataset since last 15 days and it's finally done. I'm pleased to announce that I have created the first QnA dataset for Bhagavad Gita not only in English but also in Hindi and Gujarati.

What’s inside:

  • Verse‑aligned English, Hindi and Gujarati questions
  • Each verse is paired with 5 question types exploring different aspects.
  • Perfect for blending Spirituality and Technology

Link : https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita-QA

(P.S: I'm the one who created the most liked and most downloaded dataset for Bhagavad Gita)

Indian AI Developers org

Hey everyone👋

I fine-tuned Qwen3-0.6B on the soketlabs/bhasha-wiki-indic dataset ( i used only ~50k Hindi samples ). The training went fine, loss was around 1.46 and accuracy about 59% but the outputs are completely wrong and make no sense.

I think the issue might be with the dataset format since it’s plain text and not instruction-based so the model probably didn’t learn proper Q&A or instruction-following.

Has anyone faced this before? Should I switch to something like ai4bharat/indic-instruct-data-v0.1 for better results? Any suggestions? 🙏

Indian AI Developers org

Same bro, I ft qwen2.5 3b with my gita dataset and the acc was 60% something with loss around 1.42 and the responses were okayish but still it was writing correct hindi but when I made a space out of it, it's just writing anything

Indian AI Developers org

Same bro, I ft qwen2.5 3b with my gita dataset and the acc was 60% something with loss around 1.42 and the responses were okayish but still it was writing correct hindi but when I made a space out of it, it's just writing anything

Yeah exactly bro, I thought good loss & accuracy meant it was learning properly but turns out it's more about the data structure. My model also writes proper Hindi but gives totally unrelated or random outputs. I think because plain text data only teaches it next-word prediction and not how to follow instructions or answer properly.

I’m planning to try instruction-style data next (ai4bharat/indic-instruct-data-v0.1). Maybe that’ll fix it. Have you tried using instruction-tuned datasets or formatting your data as Q&A pairs? How you fixed that issue?

Indian AI Developers org

SarathiAI v1.0
Eternal Gita wisdom, guided by AI.

I tried fine-tuning Qwen2.5-3B model on a custom Bhagavad Gita dataset for the first time. The model now understands user queries more accurately and responds with answers grounded in Gita teachings. It's not perfect but I'm happy with everything I learned building it. Please give it a try and you can share what should I improve.

Link to Space: https://huggingface.co/spaces/JDhruv14/Sarathi.AI

Link to Model: https://huggingface.co/JDhruv14/Qwen2.5-3B-Gita-FT

Link to Dataset: https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita-QA

Indian AI Developers org
edited 26 days ago

Well, I am feeling good seeing your work guys.

@JDhruv14 , @Neural-Hacker

Indian AI Developers org

guys, i am building a model (fine-tuning), i want a lot of compute and vram please tell where i can find free or cheap gpu because aws, azure, etc are very costly and maybe kaggle won't work here but i will try it tomorrow. i am thinking to use ola cloud but if there's any platform better than ola then please suggest.

Indian AI Developers org

Hello everyone,

I’m excited to share NEET_BioBERT , a fine-tuned lightweight transformer model trained specifically on NEET-style biology multiple-choice questions.
It’s designed for educational AI assistants, practice exam bots and MCQ reasoning systems.

Explore it here: https://huggingface.co/Neural-Hacker/NEET_BioBERT

use and upvote it

Indian AI Developers org

Is it fine tuned on question papers only or all content of NEET like books, PDFs, etc ?

Indian AI Developers org

it is fine tuned on a dataset consisting ~800 questions including practice questions and pyqs

Indian AI Developers org

Nice !

Sign up or log in to comment