AI & ML interests
Resources, tools and content from Arm and our partner ecosystem that enable you to deploy your workloads quickly, efficiently and securely.
Accelerate AI model deployment from cloud to edge
Arm on Hugging Face helps developers deploy Hugging Face models faster with optimized performance on Arm-based devices and platforms. Our guides, tools, and learning paths show how Arm integrates with major operating systems and frameworks, making it easier to build, optimize, and scale AI models across real-world use cases from cloud to edge, gaming to mobile.
Follow our curated Learning Paths to:
- Explore Arm-optimized AI models available in our Hugging Face Model Collections
- Use libraries and ML frameworks like PyTorch, ExecuTorch, llama.cpp, ONNX Runtime, and KleidiAI
- Streamline your journey from discovery to deployment across AI use cases like real-time chatbots, sentiment analysis, neural graphics, object detection, and more
What can I build with Arm on Hugging Face?
Explore curated learning paths using Hugging Face models, optimised to run on platforms like Raspberry Pi, smartphones, and Arm-based cloud servers.
Neural Graphics
| Learning Path | Frameworks & Tools Used | Model(s) Featured | Market Application | Examples | Arm Learning Path |
|---|---|---|---|---|---|
| Neural Super Sampling in Unreal Engine | NSS Plugin for Unreal®, Unreal® NNE Plugin for ML extensions for Vulkan, Neural Graphics Model Gym | Neural Super Sampling (NSS) | Smartphone | Graphics upscaling, Enchanted Castle Demo | Run NSS in Unreal → |
Generative AI
| Learning Path | Frameworks & Tools Used | Model(s) Featured | Market Application | Examples | Arm Learning Path |
|---|---|---|---|---|---|
| Build a RAG application | Zilliz Cloud, llama.cpp | All MiniLM L6 V2 | Cloud & Datacenter | Document retrieval + Q&A pipelines | Build with Zilliz → |
| Accelerate NLP models for faster inference | PyTorch, KleidiAI | DistilBERT Base Uncased SST-2 | Cloud & Datacenter | Sentiment analysis, text classification | Accelerate NLP → |
| Deploy an LLM chatbot with optimised performance | llama.cpp, KleidiAI | Dolphin 2.9.4, Llama 3.1 8B GGUF | Cloud & Datacenter | Real-time chatbots, enterprise assistants | Deploy with llama.cpp → |
| Run an LLM chatbot with PyTorch | PyTorch, Torchchat, Streamlit, KleidiAI | Llama 3.1 8B Instruct | Cloud & Datacenter | Inference pipelines with PyTorch | Run with PyTorch → |
| Deploy a RAG chatbot on Google Axion processors | llama-cpp-python, Faiss, KleidiAI | Llama 3.1 8B GGUF | Cloud & Datacenter | RAG-based assistants at cloud scale | Deploy with Axion → |
| Build an Android chat app | ExecuTorch, XNNPACK, KleidiAI | Llama 3.2 1B Instruct | Smartphone | On-device chat apps | Build on Android → |
| Run Llama 3 on Raspberry Pi 5 | ExecuTorch | Llama 3.1 8B | Raspberry Pi | Edge LLM deployment | Run Llama 3 on Pi 5 → |
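The RAG learning paths in the table above pair an embedding model with a vector store and an LLM. As a rough, library-free sketch of the retrieval step only, the snippet below ranks documents by cosine similarity. It assumes toy 3-dimensional vectors in place of real embeddings; the document ids, the vectors, and the helper names `cosine` and `top_k` are hypothetical and do not come from the learning paths, which use tools such as Zilliz Cloud or Faiss for this step.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the ids of the k documents most similar to query_vec."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical toy "embeddings"; a real pipeline would obtain these
# from an embedding model rather than hard-coding them.
index = {
    "doc_neon": [0.9, 0.1, 0.0],
    "doc_cloud": [0.1, 0.9, 0.1],
    "doc_mobile": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]

retrieved = top_k(query, index, k=2)
# The retrieved ids would then be used to build the context passed to the LLM.
prompt_context = "Answer using: " + ", ".join(retrieved)
print(prompt_context)
```

In a real deployment the linear scan would typically be replaced by a vector index, and the retrieved passages, not just their ids, would be placed in the prompt.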
CV: Image Classification & Object Detection
| Learning Path | Frameworks & Tools Used | Model(s) Featured | Market Application | Examples | Arm Learning Path |
|---|---|---|---|---|---|
| Profile AI/ML performance on mobile apps | ExecuTorch, Arm Performance Studio, Android Studio Profiler | MobileNet V2 1.0 224 | Smartphone | App performance benchmarking | Profile mobile apps → |
| Run CV models on microcontrollers | Himax MCU, Arm toolchain | YOLOv8 | IoT | Object detection on MCUs | Run on MCU → |
| Export PyTorch models for edge devices | PyTorch, ExecuTorch | DistilBERT Base Uncased SST-2 | IoT | Deploy compact AI models on MCUs | Export with ExecuTorch → |
Sentiment Analysis
| Learning Path | Frameworks & Tools Used | Model(s) Featured | Market Application | Examples | Arm Learning Path |
|---|---|---|---|---|---|
| Accelerate NLP models from Hugging Face on Arm servers | PyTorch | DistilBERT Base Uncased SST-2 | Cloud & Datacenter | Text classification, sentiment analysis | Accelerate NLP on Arm → |
Speed Up AI Model Inference with Arm Kleidi
Arm Kleidi, comprising KleidiAI and KleidiCV, delivers out-of-the-box AI acceleration across popular frameworks, including PyTorch, llama.cpp, MediaPipe (via XNNPACK), and ONNX Runtime, by integrating highly optimised micro-kernels tailored to Arm CPU architectures.
These lightweight libraries use Arm instruction set extensions such as Neon, SVE, and SME to deliver faster inference with no code changes, retraining, or extra tooling. Developers get immediate performance gains while continuing to use familiar frameworks.
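To see which of these extensions a given Linux machine advertises, one option is to read the `Features` line of `/proc/cpuinfo`. The sketch below is a minimal illustration under that assumption: the function name `detect_arm_features` and the sample text are hypothetical, and the result only reflects what the kernel reports, not whether a particular framework build actually dispatches to Kleidi kernels.

```python
def detect_arm_features(cpuinfo_text):
    """Report Neon, SVE and SME support from Linux arm64 /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On arm64 Linux the CPU capability flags sit on the "Features" line.
        if sep and key.strip().lower() == "features":
            flags.update(value.split())
    return {
        "neon": "asimd" in flags,  # Linux lists Neon as "asimd" (Advanced SIMD)
        "sve": "sve" in flags,
        "sme": "sme" in flags,
    }


# Hypothetical sample text, shaped like an arm64 /proc/cpuinfo entry.
sample = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm aes sha2 crc32 atomics sve i8mm bf16\n"
)
report = detect_arm_features(sample)
print(report)
```

On a real Arm-based Linux system, the same function could be called with the contents of `/proc/cpuinfo`, for example `detect_arm_features(open("/proc/cpuinfo").read())`. On other operating systems or architectures the flags will simply be absent.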
What You Can Do with Arm Kleidi:
- Accelerate Hugging Face models on real hardware
- Boost performance for computer vision, NLP, and generative AI workloads
- Use your existing models with no retraining required
- Integrate with familiar frameworks and runtimes
- Optimise for cloud, mobile, edge, and microcontroller platforms
Key Resources:
- Arm KleidiAI GitLab repo – Supports general-purpose AI acceleration
- Arm KleidiCV GitLab repo – Optimisation for computer vision models
- Arm Compute Library on GitHub – Low-level acceleration for AI software
Get started
Note: The data collated here is sourced from Arm and third parties. While Arm uses reasonable efforts to keep this information accurate, Arm does not warrant (express or implied) or provide any guarantee of data correctness due to the ever-evolving AI and software landscape. Any links to third-party sites and resources are provided for ease and convenience. Your use of such third-party sites and resources is subject to the third party’s terms of use, and use is at your own risk.
- meta-llama/Llama-3.1-8B-Instruct: Text Generation, 8B parameters, 10M downloads, 5.34k likes
- meta-llama/Llama-3.2-1B-Instruct: Text Generation, 1B parameters, 2.68M downloads, 1.27k likes
- dphn/dolphin-2.9.4-llama3.1-8b: 8B parameters, 8.32k downloads, 97 likes
- chatpdflocal/llama3.1-8b-gguf: 8B parameters, 1.84k downloads, 29 likes