arxiv:2410.13287
Hu Xiaoyan
Yann1021
AI & ML interests
Reinforcement learning
Recent Activity
authored a paper about 2 months ago
Provably Efficient CVaR RL in Low-rank MDPs
authored a paper about 2 months ago
PAK-UCB Contextual Bandit: An Online Learning Approach to Prompt-Aware Selection of Generative Models and LLMs
authored a paper about 2 months ago
A Multi-Armed Bandit Approach to Online Selection and Evaluation of Generative Models