Update README.md
README.md
CHANGED
@@ -4,13 +4,14 @@ license: creativeml-openrail-m
 ---
 <h1 align='center' style='font-size: 36px; font-weight: bold;'>Sparrow</h1>
 <h3 align='center' style='font-size: 24px;'>Tiny Vision Language Model</h3>
-<h4 align='center', style='font-size: 18px;' >A Custom Model Enhanced for Educational Contexts: This specialized model integrates slide-text pairs from machine learning classes, leveraging a unique training approach. It connects a frozen pre-trained vision encoder (SigLip) with a frozen language model (Phi-2) through an innovative projector. The model employs attention mechanisms and language modeling loss to deeply understand and generate educational content, specifically tailored to the context of machine learning education. </h4>
 
 
 <p align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/650c7fbb8ffe1f53bdbe1aec/DTjDSq2yG-5Cqnk6giPFq.jpeg" width="50%" height="auto"/>
 </p>
 
+<h4 align='center', style='font-size: 16px;' >A Custom Model Enhanced for Educational Contexts: This specialized model integrates slide-text pairs from machine learning classes, leveraging a unique training approach. It connects a frozen pre-trained vision encoder (SigLip) with a frozen language model (Phi-2) through an innovative projector. The model employs attention mechanisms and language modeling loss to deeply understand and generate educational content, specifically tailored to the context of machine learning education. </h4>
+
 <p align='center' style='font-size: 16px;'>
 3B parameter model built by <a href="https://www.linkedin.com/in/manishkumarthota/">@Manish</a> using SigLIP, Phi-2, Language Modeling Loss, LLaVa data, and Custom setting training dataset.
 The model is released for research purposes only, commercial use is not allowed.