ManishThota committed on
Commit 48d243e · verified · 1 Parent(s): 11b3e83

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -4,13 +4,14 @@ license: creativeml-openrail-m
 ---
 <h1 align='center' style='font-size: 36px; font-weight: bold;'>Sparrow</h1>
 <h3 align='center' style='font-size: 24px;'>Tiny Vision Language Model</h3>
-<h4 align='center', style='font-size: 18px;' >A Custom Model Enhanced for Educational Contexts: This specialized model integrates slide-text pairs from machine learning classes, leveraging a unique training approach. It connects a frozen pre-trained vision encoder (SigLip) with a frozen language model (Phi-2) through an innovative projector. The model employs attention mechanisms and language modeling loss to deeply understand and generate educational content, specifically tailored to the context of machine learning education. </h4>
 
 
 <p align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/650c7fbb8ffe1f53bdbe1aec/DTjDSq2yG-5Cqnk6giPFq.jpeg" width="50%" height="auto"/>
 </p>
 
+<h4 align='center', style='font-size: 16px;' >A Custom Model Enhanced for Educational Contexts: This specialized model integrates slide-text pairs from machine learning classes, leveraging a unique training approach. It connects a frozen pre-trained vision encoder (SigLip) with a frozen language model (Phi-2) through an innovative projector. The model employs attention mechanisms and language modeling loss to deeply understand and generate educational content, specifically tailored to the context of machine learning education. </h4>
+
 <p align='center' style='font-size: 16px;'>
 3B parameter model built by <a href="https://www.linkedin.com/in/manishkumarthota/">@Manish</a> using SigLIP, Phi-2, Language Modeling Loss, LLaVa data, and Custom setting training dataset.
 The model is released for research purposes only, commercial use is not allowed.
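
For readers who want a concrete picture of the architecture the README describes (frozen SigLIP vision encoder, frozen Phi-2, trainable projector, language-modeling loss), here is a minimal sketch. It assumes a plain linear projector, the public `google/siglip-base-patch16-224` and `microsoft/phi-2` checkpoints, and simple prepend-image-tokens conditioning; the class name `SparrowLikeVLM` and these details are illustrative assumptions, not the released implementation.

```python
# Minimal sketch, assuming a linear projector and simple image-token prepending;
# the README does not specify the actual "innovative projector" design.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, SiglipVisionModel


class SparrowLikeVLM(nn.Module):  # hypothetical name, for illustration only
    def __init__(self,
                 vision_name="google/siglip-base-patch16-224",  # assumed checkpoint
                 lm_name="microsoft/phi-2"):
        super().__init__()
        # Frozen pre-trained vision encoder (SigLIP)
        self.vision = SiglipVisionModel.from_pretrained(vision_name)
        self.vision.requires_grad_(False)
        # Frozen language model (Phi-2)
        self.lm = AutoModelForCausalLM.from_pretrained(lm_name)
        self.lm.requires_grad_(False)
        # Trainable projector mapping patch features into the LM embedding space
        self.projector = nn.Linear(self.vision.config.hidden_size,
                                   self.lm.config.hidden_size)

    def forward(self, pixel_values, input_ids, labels=None):
        # Encode the slide image and project its patch features
        patches = self.vision(pixel_values=pixel_values).last_hidden_state
        image_embeds = self.projector(patches)
        # Embed the text tokens and prepend the projected image tokens
        text_embeds = self.lm.get_input_embeddings()(input_ids)
        inputs_embeds = torch.cat([image_embeds, text_embeds], dim=1)
        if labels is not None:
            # Mask image positions out of the language-modeling loss
            ignore = torch.full(image_embeds.shape[:2], -100,
                                dtype=labels.dtype, device=labels.device)
            labels = torch.cat([ignore, labels], dim=1)
        # The frozen LM computes logits and, when labels are given, the LM loss
        return self.lm(inputs_embeds=inputs_embeds, labels=labels)
```

Under this setup only the projector receives gradients, which matches the README's description of keeping both the vision encoder and the language model frozen.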