Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference. arXiv:2405.18628. Published May 28, 2024.