Update default readme file

README.md +259 -0

@@ -1,3 +1,262 @@
 ---
+language: en
 license: apache-2.0
+library_name: tensorflow
+tags:
+  - tensorflow
+  - keras
+  - tflite
+  - emotion-recognition
+  - transformer
+  - lstm
+  - mediapipe
+  - computer-vision
+  - deep-learning
+  - facial-expression
+  - affective-computing
+  - sequential-data
+model-index:
+  - name: emotion_landmark_lstm_model
+    results:
+      - task:
+          type: sequence-classification
+        dataset:
+          type: dataset
+          name: Optimized 478-Point 3D Facial Landmark Dataset
+        metrics:
+          - name: accuracy
+            type: accuracy
+            value: 0.7289
+    inference: "Supports TensorFlow and TensorFlow Lite real-time inference"
+---
+# 🎥 Emotion Sequence Transformer (TensorFlow) — Mediapipe 478 Landmarks (Seq256)
+**Version:** v1.0
+**Framework:** TensorFlow 2.x
+**Optimized format:** TensorFlow Lite
+**Input:** 478 MediaPipe Face Mesh landmarks per frame (up to 300 frames)
+**Output:** 6-class emotion prediction (`Angry`, `Disgust`, `Fear`, `Happy`, `Neutral`, `Sad`)
+---
+## 🧠 Model Overview
+The **Emotion Sequence Transformer** is a TensorFlow deep learning model that recognizes **human emotions** from **video clips**.
+It takes **478 MediaPipe facial landmarks per frame** as input and captures spatiotemporal patterns of facial movement.
+The model predicts one of six basic emotions by analyzing both facial geometry and temporal variation within sequences of up to **300 frames**.
+This model is suitable for **real-time video-based emotion detection**, **affective computing**, **human-computer interaction**, and **emotion-aware AI systems**.
+---
+## 📊 Dataset
+This model was trained on the **[Optimized 478-Point 3D Facial Landmark Dataset](https://www.kaggle.com/datasets/psewmuthu/optimized-video-facial-landmarks)**,
+a dataset derived from the **Video Emotion Dataset** and optimized for emotion recognition using MediaPipe's 3D face mesh landmarks.
+Each sample in the dataset includes:
+- Up to **300 frames per clip**
+- **478 facial landmarks per frame**
+- Corresponding **emotion label**
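Since clips hold up to 300 frames, shorter or longer landmark sequences must be brought to a fixed length before they match the `(300, 478)` input shape. The sketch below is one possible way to do that. It assumes zero-padding at the end and truncation to the first 300 frames; the scheme actually used during training is not documented here, and the helper name `pad_or_truncate` is hypothetical.

```python
import numpy as np

MAX_FRAMES = 300      # maximum clip length stated in this README
NUM_LANDMARKS = 478   # features per frame, matching the (300, 478) input shape


def pad_or_truncate(clip: np.ndarray, max_frames: int = MAX_FRAMES) -> np.ndarray:
    """Return a (max_frames, NUM_LANDMARKS) float32 array.

    Assumption: shorter clips are zero-padded at the end and longer clips
    are cut to their first max_frames frames.
    """
    if clip.ndim != 2 or clip.shape[1] != NUM_LANDMARKS:
        raise ValueError(f"expected (frames, {NUM_LANDMARKS}), got {clip.shape}")
    out = np.zeros((max_frames, clip.shape[1]), dtype=np.float32)
    n = min(len(clip), max_frames)  # number of real frames to copy
    out[:n] = clip[:n]
    return out
```

If the training pipeline padded differently (for example by repeating the last frame), the same helper can be adapted by changing how `out` is initialized.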
+---
+## 🧩 Model Architecture
+The architecture is based on a **Transformer encoder** design that processes sequential data of facial landmarks.
+**Pipeline:**
+1. Input normalization using precomputed mean and std (global stats)
+2. Sequence embedding via positional encodings
+3. Transformer encoder blocks to capture temporal and spatial dependencies
+4. Dense layers for emotion classification (6 output neurons with softmax)
+**Core Components:**
+- Transformer Encoder Layers (Multi-Head Self-Attention)
+- Layer Normalization and Dropout
+- Dense classification head
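To illustrate step 2 of the pipeline, here is a minimal NumPy sketch of a sinusoidal positional encoding over the frame axis. This is an assumption for illustration only: this README does not say whether the model uses sinusoidal or learned positional encodings, and the embedding width `d_model=64` is a hypothetical value.

```python
import numpy as np


def sinusoidal_positional_encoding(num_frames: int, d_model: int) -> np.ndarray:
    """Return a (num_frames, d_model) matrix of sine/cosine position codes."""
    positions = np.arange(num_frames)[:, None]   # (num_frames, 1)
    dims = np.arange(d_model)[None, :]           # (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one frequency.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates             # (num_frames, d_model)
    encoding = np.zeros((num_frames, d_model), dtype=np.float32)
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions use sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions use cosine
    return encoding


# One code vector per frame, to be added to the per-frame embeddings.
pe = sinusoidal_positional_encoding(num_frames=300, d_model=64)
```

The encoding gives each of the 300 frame positions a distinct vector, which lets self-attention layers distinguish frame order.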
+---
+## 📈 Performance
+| Metric                | Value      |
+| --------------------- | ---------- |
+| **Test Accuracy**     | 0.7289     |
+| **Test Loss**         | 1.1336     |
+| **Macro F1-Score**    | 0.73       |
+| **Weighted F1-Score** | 0.73       |
+| **Max Clip Length**   | 300 frames |
+| **Input Shape**       | (300, 478) |
+### 🧾 Classification Report
+| Emotion              | Precision | Recall | F1-score            | Support |
+| -------------------- | --------- | ------ | ------------------- | ------- |
+| Angry                | 0.75      | 0.73   | 0.74                | 139     |
+| Disgust              | 0.88      | 0.70   | 0.78                | 128     |
+| Fear                 | 0.52      | 0.60   | 0.55                | 114     |
+| Happy                | 0.88      | 0.97   | 0.92                | 129     |
+| Neutral              | 0.66      | 0.79   | 0.72                | 101     |
+| Sad                  | 0.70      | 0.58   | 0.64                | 134     |
+| **Accuracy**         |           |        | **0.73**            | 745     |
+| **Macro Avg**        |           |        | **0.73**            | 745     |
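The summary figures can be cross-checked against the per-class rows. The sketch below recomputes the macro and weighted F1-scores from the table values; because the table is rounded to two decimals, the results only agree with the reported 0.73 to within rounding.

```python
# Per-class values copied from the classification report above:
# (precision, recall, f1, support)
report = {
    "Angry":   (0.75, 0.73, 0.74, 139),
    "Disgust": (0.88, 0.70, 0.78, 128),
    "Fear":    (0.52, 0.60, 0.55, 114),
    "Happy":   (0.88, 0.97, 0.92, 129),
    "Neutral": (0.66, 0.79, 0.72, 101),
    "Sad":     (0.70, 0.58, 0.64, 134),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(s for _, _, _, s in report.values())

# Macro F1: unweighted mean of the per-class F1-scores.
macro_f1 = sum(f for _, _, f, _ in report.values()) / len(report)

# Weighted F1: per-class F1-scores weighted by class support.
weighted_f1 = sum(f * s for _, _, f, s in report.values()) / total_support

print(f"support={total_support} macro_f1={macro_f1:.3f} weighted_f1={weighted_f1:.3f}")
```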
+---
+## 📊 Visualizations
+### 🔹 Training Accuracy and Loss
+![Accuracy and Loss](images/Accuracies_and_Losses.png)
+### 🔹 Confusion Matrix
+![Confusion Matrix](images/Confusion_Matrix.png)
+### 🔹 ROC Curves (Per Class)
+![ROC Curves](images/ROC_Curves.png)
+---
+## 📂 Repository Structure
+```
+TF-Emotion-Sequence-Transformer/
+├── tf_emotion_sequence_transformer_mp478_seq256.h5
+├── tf_emotion_sequence_transformer_mp478_seq256_optimized.tflite
+├── tf_emotion-sequence-transformer-bilstm-usage.ipynb
+├── assets/
+│   ├── global_mean.npy
+│   ├── global_std.npy
+│   ├── label_encoder.pkl
+│   └── metadata.json
+└── README.md
+```
+### File Descriptions
+| File                                                            | Description                                                                                          |
+| --------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
+| `tf_emotion_sequence_transformer_mp478_seq256.h5`               | Main TensorFlow model trained on 478 landmarks (300 frames max).                                     |
+| `tf_emotion_sequence_transformer_mp478_seq256_optimized.tflite` | Optimized TensorFlow Lite version for deployment (mobile, edge).                                     |
+| `tf_emotion-sequence-transformer-bilstm-usage.ipynb`            | Example notebook demonstrating how to use the model for emotion prediction from MediaPipe landmarks. |
+| `assets/global_mean.npy`                                        | Precomputed global mean for normalization.                                                           |
+| `assets/global_std.npy`                                         | Precomputed global standard deviation for normalization.                                             |
+| `assets/label_encoder.pkl`                                      | Encoder mapping integer labels to emotion names.                                                     |
+| `assets/metadata.json`                                          | Model metadata and configuration details.                                                            |
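The two normalization assets are applied as `(x - mean) / std`. A minimal sketch of that step follows, with a guard against zero standard deviations. The guard and the helper name `normalize` are assumptions added for illustration; the shapes of the stored arrays are not documented here, so the code relies on NumPy broadcasting and works with either scalar or per-feature statistics.

```python
import numpy as np


def normalize(seq: np.ndarray, mean, std, eps: float = 1e-8) -> np.ndarray:
    """Standardize a landmark sequence with precomputed global statistics.

    Assumption: entries of std that are (near) zero are replaced by 1.0
    so constant features do not produce inf or NaN values.
    """
    std = np.asarray(std, dtype=np.float64)
    safe_std = np.where(np.abs(std) < eps, 1.0, std)
    return ((seq - mean) / safe_std).astype(np.float32)
```

In practice `mean` and `std` would come from `np.load("assets/global_mean.npy")` and `np.load("assets/global_std.npy")`.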
+---
+## 🚀 Example Usage
+### 🔸 TensorFlow (.h5) Model
+```python
+import numpy as np
+import tensorflow as tf
+import joblib
+# Load Model
+model = tf.keras.models.load_model("tf_emotion_sequence_transformer_mp478_seq256.h5")
+# Load assets
+mean = np.load("assets/global_mean.npy")
+std = np.load("assets/global_std.npy")
+label_encoder = joblib.load("assets/label_encoder.pkl")
+# Preprocess input
+input_seq = np.load("example_input.npy")  # shape: (300, 478)
+input_seq = (input_seq - mean) / std
+input_seq = np.expand_dims(input_seq, axis=0)
+# Predict
+pred = model.predict(input_seq)
+emotion = label_encoder.inverse_transform([np.argmax(pred)])[0]
+print("Predicted Emotion:", emotion)
+```
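Beyond the single top prediction, it can be useful to inspect the full probability ranking. The sketch below ranks the six softmax outputs. The hard-coded class order is an assumption (alphabetical, matching the order listed in this README); in practice the order should be read from the loaded encoder, for example `label_encoder.classes_` if it is a scikit-learn `LabelEncoder`. The helper name `rank_emotions` is hypothetical.

```python
import numpy as np

# Assumed class order; verify against the label encoder before relying on it.
CLASS_NAMES = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad"]


def rank_emotions(pred, class_names=CLASS_NAMES):
    """Return (label, probability) pairs sorted from most to least likely.

    pred is the model output for one clip, shape (6,) or (1, 6).
    """
    probs = np.asarray(pred, dtype=np.float64).reshape(-1)
    if probs.size != len(class_names):
        raise ValueError(f"expected {len(class_names)} scores, got {probs.size}")
    order = np.argsort(probs)[::-1]  # indices sorted by descending probability
    return [(class_names[i], float(probs[i])) for i in order]


# Example with a made-up prediction vector.
example_pred = np.array([[0.05, 0.05, 0.10, 0.60, 0.15, 0.05]])
ranked = rank_emotions(example_pred)
```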
+---
+### 🔸 TensorFlow Lite (Optimized) Model
+```python
+import numpy as np
+import tensorflow as tf
+import joblib
+# Load TFLite model
+interpreter = tf.lite.Interpreter(model_path="tf_emotion_sequence_transformer_mp478_seq256_optimized.tflite")
+interpreter.allocate_tensors()
+# Get input and output tensors
+input_details = interpreter.get_input_details()
+output_details = interpreter.get_output_details()
+# Load preprocessing assets
+mean = np.load("assets/global_mean.npy")
+std = np.load("assets/global_std.npy")
+label_encoder = joblib.load("assets/label_encoder.pkl")
+# Prepare input
+input_seq = np.load("example_input.npy")  # shape: (300, 478)
+input_seq = (input_seq - mean) / std
+input_seq = np.expand_dims(input_seq, axis=0).astype(np.float32)
+# Inference
+interpreter.set_tensor(input_details[0]['index'], input_seq)
+interpreter.invoke()
+pred = interpreter.get_tensor(output_details[0]['index'])
+# Decode emotion
+emotion = label_encoder.inverse_transform([np.argmax(pred)])[0]
+print("Predicted Emotion:", emotion)
+```
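`interpreter.set_tensor` raises an error when the array's shape or dtype does not match the model input, so a small pre-check can make failures easier to diagnose. The sketch below validates an array against one entry of `interpreter.get_input_details()`, using its `shape` and `dtype` fields. The helper name `check_tflite_input` is hypothetical, and the example dictionary stands in for real interpreter output, with a shape of `(1, 300, 478)` assumed from this README.

```python
import numpy as np


def check_tflite_input(array: np.ndarray, input_detail: dict) -> np.ndarray:
    """Validate shape and cast dtype to what the TFLite input tensor expects."""
    expected_shape = tuple(int(d) for d in input_detail["shape"])
    expected_dtype = np.dtype(input_detail["dtype"])
    if tuple(array.shape) != expected_shape:
        raise ValueError(f"expected shape {expected_shape}, got {tuple(array.shape)}")
    return array.astype(expected_dtype, copy=False)


# Stand-in for input_details[0]; real values come from the interpreter.
fake_detail = {"index": 0, "shape": np.array([1, 300, 478], dtype=np.int32), "dtype": np.float32}
checked = check_tflite_input(np.zeros((1, 300, 478)), fake_detail)
```

With a real interpreter, the call would be `check_tflite_input(input_seq, input_details[0])` just before `set_tensor`.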
+---
+## 🔖 Version Information
+**Version:** v1.0
+**Date:** November 2025
+**Author:** [P.S. Abewickrama Singhe](https://www.kaggle.com/psewmuthu)
+**Framework:** TensorFlow 2.x
+**Exported Models:** `.h5`, `.tflite`
+**Landmarks per frame:** 478
+**Max frames per clip:** 300
+---
+## 🏷️ Tags
+`tensorflow` • `emotion-recognition` • `mediapipe` • `transformer` • `sequence-model` • `facial-landmarks` • `video-analysis` • `tflite` • `human-emotion-ai` • `affective-computing` • `computer-vision` • `deep-learning`
+---
+## 📚 Citation
+If you use this model in your research, please cite it as:
+```bibtex
+@misc{pasindu_sewmuthu_abewickrama_singhe_2025,
+	author       = { Pasindu Sewmuthu Abewickrama Singhe },
+	title        = { EmotionFormer-BiLSTM (Revision f329517) },
+	year         = 2025,
+	url          = { https://huggingface.co/PSewmuthu/EmotionFormer-BiLSTM },
+	doi          = { 10.57967/hf/6899 },
+	publisher    = { Hugging Face }
+}
+```
+---
+## 🪪 License
+This model is released under the **Apache 2.0 License**, which permits academic and commercial use provided the license and copyright notices are retained.
 ---