Xilin Jiang

xi-j

authored 6 papers 9 months ago

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis

Paper • 2407.09732 • Published Jul 13, 2024 • 10

Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation

Paper • 2408.11849 • Published Aug 13, 2024

Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue

Paper • 2409.04927 • Published Sep 7, 2024

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

Paper • 2409.10058 • Published Sep 16, 2024 • 2

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Paper • 2309.09493 • Published Sep 18, 2023

AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

Paper • 2502.16794 • Published Feb 24, 2025 • 5

authored 7 papers over 1 year ago

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

Paper • 2205.07390 • Published May 15, 2022

Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

Paper • 2301.08810 • Published Jan 20, 2023

SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model

Paper • 2405.11831 • Published May 20, 2024 • 1

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience

Paper • 2402.03710 • Published Feb 6, 2024

Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation

Paper • 2403.18257 • Published Mar 27, 2024 • 1

Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation

Paper • 2309.15938 • Published Sep 27, 2023

DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes

Paper • 2305.18441 • Published May 29, 2023