Spaces: DontPlanToEnd/UGI-Leaderboard

Eval requests: G4 E4B/MoE ARA vs SOMA

#650

by MuXodious - opened about 1 month ago

MuXodious

about 1 month ago • edited 30 days ago

Hey, DontPlanToEnd, lad. I have risen from the dead to nag you with the following models. As you are aware, I have a thing for the tiny E4B multimodal model and for comparing and contrasting different ablation methods. Both were cooked on basically the same set of markers and score about the same on the PIQA benchmark. However, UGI should shed a brighter light for drawing conclusions.
https://huggingface.co/MuXodious/gemma-4-E4B-it-ARA-heresy

https://huggingface.co/MuXodious/gemma-4-E4B-it-SOMPOA-heresy

Highly optional at this time, ~~as I've yet to cook its counterpart and am unsure when I will~~ I've had it done:

https://huggingface.co/MuXodious/gemma-4-26B-A4B-it-ARA-heresy

https://huggingface.co/MuXodious/gemma-4-26B-A4B-it-SOMPOA-heresy

MuXodious changed discussion title from Eval requests to Eval requests: G4 E4B/MoE ARA vs SOMA 17 days ago

DontPlanToEnd

Owner about 2 hours ago

MuXodious/gemma-4-E4B-it-ARA-heresy and MuXodious/gemma-4-E4B-it-SOMPOA-heresy are giving me errors like this for some reason:
"""
ValueError: Following weights were not initialized from checkpoint: {'language_model.model.layers.33.self_attn.k_norm.weight', 'language_model.model.layers.31.self_attn.k_norm.weight', 'language_model.model.layers.38.self_attn.k_norm.weight'
...
"""
Added the 26Bs
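
For anyone trying to narrow down the error above: one way to check is to list which layers in the uploaded checkpoint actually carry a `self_attn.k_norm.weight` tensor. The snippet below is only a minimal sketch, not the leaderboard's loader. It assumes the repo ships a sharded-checkpoint index file (`model.safetensors.index.json` with a `weight_map` entry), which a single-file upload would not have, and the helper names `layers_missing_param` and `weight_names_from_index` are made up for illustration.

```python
import json
import re


def layers_missing_param(weight_names, suffix="self_attn.k_norm.weight"):
    """Return sorted layer indices that have tensors but no tensor ending in `suffix`.

    `weight_names` is any iterable of tensor names, e.g.
    'language_model.model.layers.33.self_attn.k_norm.weight'.
    """
    layer_re = re.compile(r"layers\.(\d+)\.(.+)$")
    seen, has_param = set(), set()
    for name in weight_names:
        match = layer_re.search(name)
        if match is None:
            continue  # not a per-layer tensor (embeddings, final norm, ...)
        layer_idx = int(match.group(1))
        seen.add(layer_idx)
        if match.group(2) == suffix:
            has_param.add(layer_idx)
    return sorted(seen - has_param)


def weight_names_from_index(index_path):
    """Read tensor names from a sharded-checkpoint index file.

    Assumption: the file is JSON with a 'weight_map' dict whose keys are tensor names.
    """
    with open(index_path) as f:
        return list(json.load(f)["weight_map"])


if __name__ == "__main__":
    # Hypothetical names, only to show the shape of the output.
    demo = [
        "language_model.model.layers.0.self_attn.k_norm.weight",
        "language_model.model.layers.0.self_attn.q_proj.weight",
        "language_model.model.layers.1.self_attn.q_proj.weight",
    ]
    print(layers_missing_param(demo))
```

If the layers named in the error turn up in the result, the tensors are absent from the upload itself. If they do not, the mismatch is more likely on the loading side. Running the same check against the original model's index would show whether those layers are expected to have the tensor at all.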

DontPlanToEnd changed discussion status to closed about 2 hours ago
