Building high-performance, reproducible kernels for AMD ROCm just got a lot easier.
I've put together a guide on building, testing, and sharing ROCm-compatible kernels using the Hugging Face kernel-builder and kernels libraries, so you can focus on optimizing performance rather than on setup.
Learn how to:
- Use Nix for reproducible builds
- Integrate kernels as native PyTorch operators
- Share your kernels on the Hub for anyone to use with kernels.get_kernel()
We use the 🏆 award-winning RadeonFlow GEMM kernel as a practical example.
📜 Check out the full guide here: https://huggingface.co/blog/build-rocm-kernels
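For a rough idea of what the sharing step looks like from the user's side, here is a minimal sketch of loading a kernel from the Hub. It assumes the kernels package is installed (pip install kernels) and that a supported GPU and network access are available. The repo id is a placeholder, not a real repository, and the two helper names are mine, not part of the library. The functions a loaded kernel exposes depend on the kernel itself, so the sketch only lists them rather than calling one.

```python
from typing import Any, List

# Placeholder repo id (an assumption for illustration): replace it with the
# Hub repository that actually hosts the kernel you want to load.
REPO_ID = "your-username/your-rocm-kernel"


def load_hub_kernel(repo_id: str) -> Any:
    """Download and load a prebuilt kernel from the Hugging Face Hub.

    The import is done lazily so this file can be imported on machines
    where the `kernels` package is not installed.
    """
    from kernels import get_kernel

    return get_kernel(repo_id)


def public_names(obj: Any) -> List[str]:
    """Return the sorted public attribute names of a loaded kernel module.

    Useful for discovering which functions a kernel exposes, since the
    names vary from kernel to kernel.
    """
    return sorted(name for name in dir(obj) if not name.startswith("_"))
```

On a machine with a supported GPU, calling public_names(load_hub_kernel(REPO_ID)) with a real repo id should show which functions the kernel provides. The full guide covers the actual entry points of the RadeonFlow GEMM example.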