The instructions are in our llama.cpp guide: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune/qwen3-2507 The command to run it is as follows:
./llama.cpp/llama-cli \
    --model unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF/UD-Q2_K_XL/Qwen3-235B-A22B-Thinking-2507-UD-Q2_K_XL-00001-of-00002.gguf \
    --threads 32 \
    --ctx-size 16384 \
    --n-gpu-layers 99 \
    -ot ".ffn_.*_exps.=CPU" \
    --seed 3407 \
    --prio 3 \
    --temp 0.6 \
    --min-p 0.0 \
    --top-p 0.95 \
    --top-k 20 \
    --repeat-penalty 1.05 |
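
The tensor override (`-ot`) argument in the command above pairs a regular expression with a target device: tensors whose names match the expression are kept on the CPU, while the rest follow `--n-gpu-layers`. The sketch below only illustrates which names such a pattern would select, using `grep -E` as a stand-in for llama.cpp's own regex matching. The tensor names in the loop are illustrative assumptions following the usual GGUF naming for mixture-of-experts layers, and `placement` is a hypothetical helper, not part of llama.cpp.

```shell
#!/usr/bin/env bash
# Override string in the form <regex>=<device>, as passed to -ot.
override='.ffn_.*_exps.=CPU'
pattern="${override%%=*}"   # text before the "=": .ffn_.*_exps.
target="${override##*=}"    # text after the "=": CPU

# Hypothetical helper: print the override target if the tensor name
# matches the pattern, otherwise print "GPU" (the default placement
# assumed here because --n-gpu-layers 99 offloads every layer).
placement() {
  if printf '%s\n' "$1" | grep -Eq -- "$pattern"; then
    printf '%s\n' "$target"
  else
    printf 'GPU\n'
  fi
}

# Illustrative tensor names (assumed, not read from the model file).
for name in \
  blk.0.attn_q.weight \
  blk.0.ffn_gate_exps.weight \
  blk.0.ffn_up_exps.weight \
  blk.0.ffn_down_exps.weight
do
  printf '%s -> %s\n' "$name" "$(placement "$name")"
done
```

Under these assumptions only the expert feed-forward tensors are routed to the CPU, which is what lets the attention and shared weights stay in GPU memory.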