While distilling models we need to download base models and data files from huggingface.co, so the download workflow deserves a section of its own.

Installation


uv venv --python 3.12
source .venv/bin/activate

uv pip install huggingface_hub hf_transfer

# Log in with an access token (replace hf_xxxxx with your own)
hf auth login --token hf_xxxxx

# Download through a mirror in mainland China: prefix any hf download command with HF_ENDPOINT
HF_ENDPOINT=https://hf-mirror.com hf download <repo_id>

# Download every file in a repo directly
hf download google/gemma-4-1b-it        --local-dir ./gemma-4-1b-it
hf download google/translategemma-4b-it --local-dir ./translategemma-4b

# Download several files without specifying a target directory
# The files land in the local cache:
# ~/.cache/huggingface/hub/models--lmstudio-community--Qwen3.5-9B-GGUF/snapshots/1379f25c6b505a3fc737bd7818cb09389cf807c1/Qwen3.5-9B-Q4_K_M.gguf
# ~/.cache/huggingface/hub/models--lmstudio-community--Qwen3.5-9B-GGUF/snapshots/1379f25c6b505a3fc737bd7818cb09389cf807c1/mmproj-Qwen3.5-9B-BF16.gguf
hf download lmstudio-community/Qwen3.5-9B-GGUF Qwen3.5-9B-Q4_K_M.gguf mmproj-Qwen3.5-9B-BF16.gguf --revision main 
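
The cache path in the comments above follows a fixed naming scheme, so it can be worked out before downloading. The sketch below assumes the default huggingface_hub cache layout (HF_HUB_CACHE if set, otherwise $HF_HOME/hub, otherwise ~/.cache/huggingface/hub) and only builds the repo folder name. The helper name hf_model_cache_dir is made up for this note, and the snapshots/<commit hash> part cannot be predicted without asking the Hub.

```shell
# Hypothetical helper: print the cache folder for a model repo.
# The body runs in a subshell ( ... ) so its variables do not leak.
hf_model_cache_dir() (
    # Cache root: HF_HUB_CACHE wins, then $HF_HOME/hub, then the default.
    cache_root="${HF_HUB_CACHE:-${HF_HOME:-$HOME/.cache/huggingface}/hub}"
    # Every "/" in the repo id becomes "--", prefixed with "models--".
    folder="models--$(printf '%s' "$1" | sed 's|/|--|g')"
    printf '%s/%s\n' "$cache_root" "$folder"
)

hf_model_cache_dir lmstudio-community/Qwen3.5-9B-GGUF
```

Listing the snapshots directory under the printed folder then shows which revisions are already on disk.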

# Download several files into a specified directory
uv tool run hf download facebook/m2m100_418M \
    config.json \
    vocab.json \
    sentencepiece.bpe.model \
    special_tokens_map.json \
    tokenizer_config.json \
    pytorch_model.bin \
    --local-dir Translate/m2m100

# Download a single file into a specified directory
hf download Jackrong/Qwopus3.5-9B-v3-GGUF Qwopus3.5-9B-v3.Q4_K_M.gguf --local-dir Jackrong/Qwopus3.5-9B-v3-GGUF

# Download an uncensored Gemma 4 build
uv tool run hf download HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf mmproj-Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-f16.gguf

# Download all Q4 files plus the multimodal projector (mmproj)
hf download unsloth/gemma-4-26B-A4B-it-GGUF \
    --local-dir unsloth/gemma-4-26B-A4B-it-GGUF \
    --include "*mmproj-BF16*" \
    --include "*UD-Q4_K_XL*" # for the dynamic 2-bit quant, use "*UD-Q2_K_XL*"
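
To see which files a set of --include patterns would keep, the matching can be tried locally first. This is only a sketch: it relies on shell `case` globs behaving like the fnmatch-style matching that huggingface_hub applies, which holds for simple `*` patterns such as the ones above. The helper name matches_any and the file names in the loop are made up for illustration, not a listing of the real repo.

```shell
# Hypothetical helper: succeed if the file name matches any of the glob patterns.
matches_any() {
    name="$1"; shift
    for pattern in "$@"; do
        # An unquoted $pattern in a case label is treated as a glob.
        case "$name" in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Made-up file names, checked against the two patterns used above.
for f in \
    "example-UD-Q4_K_XL.gguf" \
    "example-UD-Q2_K_XL.gguf" \
    "mmproj-BF16.gguf" \
    "mmproj-F16.gguf"
do
    if matches_any "$f" "*mmproj-BF16*" "*UD-Q4_K_XL*"; then
        echo "download: $f"
    else
        echo "skip:     $f"
    fi
done
```

Only the Q4_K_XL file and the BF16 projector are marked for download; swapping in "*UD-Q2_K_XL*" selects the 2-bit file instead.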

Some very good training datasets:

https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered

Opus 4.6 data

https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x

Opus 4.5 data

https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x

Qwen 3.5 data
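
Datasets are fetched with the same hf download command, with --repo-type dataset added. As a rough sketch, the helper below turns one of the dataset page URLs above into the matching command and prints it without running it. The helper name dataset_download_cmd and the ./datasets/<name> target directory are arbitrary choices made for this note.

```shell
# Hypothetical helper: print (not run) the hf download command for a dataset URL.
# The body runs in a subshell ( ... ) so its variables do not leak.
dataset_download_cmd() (
    url="$1"
    # Strip the page prefix to get "<owner>/<name>".
    repo_id="${url#https://huggingface.co/datasets/}"
    # ${repo_id##*/} keeps only the part after the last "/".
    printf 'hf download %s --repo-type dataset --local-dir ./datasets/%s\n' \
        "$repo_id" "${repo_id##*/}"
)

dataset_download_cmd https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered
dataset_download_cmd https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x
dataset_download_cmd https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x
```

After checking the printed commands, piping the output to sh runs the downloads.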