yangwang825/xvector-voxceleb1

@@ -1,39 +1,25 @@
 ---
 library_name: transformers
 tags:
-- audio-classification
 - generated_from_trainer
 datasets:
 - voxceleb
 metrics:
 - accuracy
 model-index:
-- name: ce-len3-bs256-lr1e-3
-  results:
-  - task:
-      name: Audio Classification
-      type: audio-classification
-    dataset:
-      name: confit/voxceleb
-      type: voxceleb
-      config: verification
-      split: train
-      args: verification
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.9410023545240498
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# ce-len3-bs256-lr1e-3
-This model is a fine-tuned version of [](https://huggingface.co/) on the confit/voxceleb dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2946
-- Accuracy: 0.9410
 ## Model description
@@ -66,16 +52,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 4.6728        | 1.0   | 523  | 4.3456          | 0.1504   |
-| 3.224         | 2.0   | 1046 | 2.2589          | 0.5141   |
-| 2.3964        | 3.0   | 1569 | 1.4663          | 0.6836   |
-| 1.8474        | 4.0   | 2092 | 0.9548          | 0.7927   |
-| 1.5275        | 5.0   | 2615 | 0.6698          | 0.8571   |
-| 1.248         | 6.0   | 3138 | 0.5270          | 0.8899   |
-| 1.0991        | 7.0   | 3661 | 0.4500          | 0.9037   |
-| 0.9221        | 8.0   | 4184 | 0.3572          | 0.9267   |
-| 0.7997        | 9.0   | 4707 | 0.3138          | 0.9353   |
-| 0.7603        | 10.0  | 5230 | 0.2946          | 0.9410   |
 ### Framework versions

 ---
 library_name: transformers
 tags:
 - generated_from_trainer
 datasets:
 - voxceleb
 metrics:
 - accuracy
 model-index:
+- name: xvector-voxceleb1
+  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# xvector-voxceleb1
+This model is a fine-tuned version of [](https://huggingface.co/) on the voxceleb dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2981
+- Accuracy: 0.9405
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 4.6869        | 1.0   | 523  | 4.1199          | 0.1960   |
+| 3.2423        | 2.0   | 1046 | 2.2824          | 0.5047   |
+| 2.4164        | 3.0   | 1569 | 1.4862          | 0.6816   |
+| 1.8625        | 4.0   | 2092 | 0.9794          | 0.7917   |
+| 1.5637        | 5.0   | 2615 | 0.7048          | 0.8490   |
+| 1.265         | 6.0   | 3138 | 0.5389          | 0.8862   |
+| 1.0888        | 7.0   | 3661 | 0.4364          | 0.9101   |
+| 0.9296        | 8.0   | 4184 | 0.3617          | 0.9265   |
+| 0.8066        | 9.0   | 4707 | 0.3207          | 0.9353   |
+| 0.7675        | 10.0  | 5230 | 0.2981          | 0.9405   |
 ### Framework versions

configuration_xvector.py CHANGED

@@ -158,7 +158,7 @@ class XVectorConfig(PretrainedConfig):
         # Decoder configuration
         self.emb_sizes = emb_sizes
         self.pool_mode = pool_mode
-        self.angular = True if objective in ['additive_angular_margin'] else False
         self.attention_channels = attention_channels
         self.decoder_config = {
             "feat_in": filters[-1],

         # Decoder configuration
         self.emb_sizes = emb_sizes
         self.pool_mode = pool_mode
+        self.angular = True if objective in ['additive_angular_margin', 'additive_margin'] else False
         self.attention_channels = attention_channels
         self.decoder_config = {
             "feat_in": filters[-1],