llama-4-scout-17b-16e-instruct-gguf
- base model from meta-llama
- tested on gguf-connector with nightly llama-cpp-python (setup sketch below)
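
a minimal setup sketch, assuming plain pip installs (a nightly llama-cpp-python build may need a platform-specific install step):

```
pip install gguf-connector        # provides the ggc command
pip install -U llama-cpp-python   # inference backend used by gguf-connector
```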
 
example workflow (run it locally)
- download all parts of the chosen quant; for example, q2_k:
  - llama-4-scout-17b-16e-it-q2_k-00001-of-00004.gguf
  - llama-4-scout-17b-16e-it-q2_k-00002-of-00004.gguf
  - llama-4-scout-17b-16e-it-q2_k-00003-of-00004.gguf
  - llama-4-scout-17b-16e-it-q2_k-00004-of-00004.gguf
- pull them all into an empty folder, then execute the merge command: `ggc m2` (see the sketch below)
- the merged gguf is around 36.8GB for q2_k (one-time setup)
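
a sketch of the download-and-merge step, assuming the parts sit at the root of this repo and `huggingface-cli` is on your path:

```
mkdir q2k && cd q2k               # empty folder for the parts
for i in 00001 00002 00003 00004; do
  huggingface-cli download chatpig/llama-4-scout-17b-16e-it-gguf \
    "llama-4-scout-17b-16e-it-q2_k-$i-of-00004.gguf" --local-dir .
done
ggc m2                            # merges the parts into one ~36.8GB gguf
```
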
- execute the connector command: `ggc gpp` or `ggc cpp` (example below)
- select the merged gguf, then start your prompt to interact with llama4
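
for example, after merging (the merged filename is whatever `ggc m2` wrote out):

```
cd q2k      # folder with the merged gguf from the previous step
ggc cpp     # web ui; or run `ggc gpp` for a terminal chat
# when prompted, select the merged q2_k gguf, then type your prompt
```
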
 
for models larger than 50GB in total
- no need to merge (the parts are linked already); just execute: `ggc gpp` (or `ggc cpp` for ui)
- select the first part of the model (i.e., 00001-of-xxxxx)
- start your prompt to interact with llama4 (see the sketch after this list)
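
a sketch for a larger quant; the q4_k filename pattern here is hypothetical (only the repo id and the 00001-of-xxxxx naming come from this card):

```
mkdir q4k && cd q4k
# hypothetical quant name; adjust the pattern to the quant you want
huggingface-cli download chatpig/llama-4-scout-17b-16e-it-gguf \
  --include "llama-4-scout-17b-16e-it-q4_k*.gguf" --local-dir .
ggc gpp   # select the first part (...-00001-of-xxxxx.gguf); no merge needed
```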
 
quant types available
- 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit

model tree for chatpig/llama-4-scout-17b-16e-it-gguf
- base model: meta-llama/Llama-4-Scout-17B-16E