Separate audio into stems using various models
Generate analysis and response based on policy and prompt