
Smoliakov PRO

Yehor · https://t.me/doing_something

AI & ML interests

Speech-to-Text, Text-to-Speech, Voice over Internet Protocol

Recent Activity

updated a Space about 9 hours ago

Yehor/w2v-bert-uk-v2.1

liked a dataset 7 days ago

sarulab-speech/mls_sidon

liked a model 8 days ago

nvidia/parakeet_realtime_eou_120m-v1



upvoted 2 papers about 1 month ago

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Paper • 2510.13996 • Published Oct 15 • 7

CST5: Data Augmentation for Code-Switched Semantic Parsing

Paper • 2211.07514 • Published Nov 14, 2022 • 1

upvoted a collection about 1 month ago

MT5 release

The MT5 release follows the T5 family but is pretrained on multilingual data. The updated UMT5 models are pretrained on an updated corpus. • 10 items • Updated Jul 10 • 22

upvoted an article about 1 month ago

Article

Introducing the Polish ASR Leaderboard (PAL) and Benchmark Intended Grouping of Open Speech (BIGOS) Corpora

Jul 10, 2024 • 4

upvoted a paper about 1 month ago

The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP

Paper • 2510.05644 • Published Oct 7 • 23

upvoted 2 articles 4 months ago

Article

Test-Driving the LLMD Inference Engine by ZML 🚀

Jul 18 • 24

Article

Automated Discovery of High-Performance GPU Kernels with OpenEvolve

Jun 27 • 23

upvoted a collection 4 months ago

H-Net

The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 8 items • Updated Jul 11 • 20

upvoted a collection 5 months ago

OmniGEC

This is a collection of multilingual silver-standard datasets and models for the task of Grammatical Error Correction (GEC). • 9 items • Updated Sep 19 • 8

upvoted an article 8 months ago

Article

Boosting Wav2Vec2 with n-grams in 🤗 Transformers

Jan 12, 2022 • 15

upvoted an article 9 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12 • 471

upvoted 5 collections 9 months ago

Gemma 3

All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 55 items • Updated 23 days ago • 93

MT Quality Estimation

Models for reference-free quality estimation of machine translation • 10 items • Updated Jan 29 • 4

GTE models

General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 21 items • Updated Jan 21 • 32

Ukrainian Speech-to-Text models

4 items • Updated Jun 4 • 1

OWLS: Scaling Laws for Speech Recognition and Translation

🦉 A suite of Whisper-style models from 250M to 18B parameters. Trained on up to 360K hours of data. 16k sampling rate. • 8 items • Updated May 3 • 7

upvoted an article 9 months ago

Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

Feb 11 • 33

upvoted 3 collections 9 months ago

NeMo Curator - Classifier Models

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 1 day ago • 24

Ukrainian Text-to-Speech datasets

Five voices: Mykyta, Oleksa, Lada, Kateryna, or Tetiana • 6 items • Updated Feb 26 • 4

Crimean Tatar Text-to-Speech datasets

Three voices: Abibullah, Sevil, or Arslan • 4 items • Updated May 27 • 2