
Run Phi-3-mini on an AMD NPU

If you do not have phi3_mini_awq_4bit_no_flash_attention.pt, run AWQ quantization first to produce the quantized model.
Copy modeling_phi3.py from this repo into the phi-3-mini folder.
Update the file path in run_awq.py to match your local setup.
Run: python run_awq.py --task decode --target aie --w_bit 4
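
The checks above can be sketched as a small helper. This is only an illustration, not part of the repo: the function name `prepare` and the paths passed to it are hypothetical, and the file path inside run_awq.py still has to be edited by hand. It confirms the quantized checkpoint exists, copies modeling_phi3.py into the model folder, and returns the decode command quoted above.

```python
import shutil
import sys
from pathlib import Path


def prepare(checkpoint: Path, modeling_file: Path, model_dir: Path) -> list:
    """Check the prerequisites and return the decode command as an argument list.

    All three paths are supplied by the caller; nothing here is read from
    run_awq.py, whose file path must still be edited manually.
    """
    if not checkpoint.is_file():
        # The quantized checkpoint has to be produced with AWQ first.
        raise FileNotFoundError(f"{checkpoint} not found; run AWQ quantization first")
    if not model_dir.is_dir():
        raise NotADirectoryError(f"{model_dir} is not a folder")
    # Place the modeling file from this repo inside the model folder.
    shutil.copy(modeling_file, model_dir / modeling_file.name)
    # The command from the steps above, using the current Python interpreter.
    return [sys.executable, "run_awq.py",
            "--task", "decode", "--target", "aie", "--w_bit", "4"]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the folder that contains run_awq.py. Raising early when the checkpoint is missing avoids starting the decode run without a quantized model.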

PS: Performance on the NPU is similar to that on the CPU (Ryzen 5 7640HS).
