LLaMA-Factory Commands
LLaMA-Factory is an easy-to-use LLM fine-tuning framework supporting 100+ models with methods including SFT, RLHF, DPO, PPO, LoRA, QLoRA, and full fine-tuning. It provides both a web UI and CLI interface.
Installation
# Clone and install
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
# Or install with all extras
pip install -e ".[torch,metrics,deepspeed,bitsandbytes,vllm,gptq,awq,aqlm]"
# Verify installation
llamafactory-cli version
Web UI
# Launch the web training UI
llamafactory-cli webui
# Launch on specific port
GRADIO_SERVER_PORT=7860 llamafactory-cli webui
# Launch with share link for remote access
GRADIO_SHARE=1 llamafactory-cli webui
CLI Training
# Train with a YAML config
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
# Train with inline arguments
llamafactory-cli train \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--stage sft \
--do_train true \
--finetuning_type lora \
--dataset alpaca_en \
--template llama3 \
--output_dir ./output/llama3-lora \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--bf16 true \
--lora_rank 8 \
--lora_alpha 16
# Export / merge LoRA adapter with base model
llamafactory-cli export \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--adapter_name_or_path ./output/llama3-lora \
--template llama3 \
--finetuning_type lora \
--export_dir ./output/llama3-merged \
--export_size 5
Training Stages
# Supervised Fine-Tuning (SFT)
stage: sft
finetuning_type: lora
dataset: alpaca_en
# Reward Modeling
stage: rm
finetuning_type: lora
dataset: comparison_gpt4_en
# PPO (Proximal Policy Optimization)
stage: ppo
finetuning_type: lora
reward_model: ./output/reward_model
dataset: alpaca_en
# DPO (Direct Preference Optimization)
stage: dpo
finetuning_type: lora
dataset: comparison_gpt4_en
# KTO (Kahneman-Tversky Optimization)
stage: kto
finetuning_type: lora
dataset: kto_en
SFT Config Example
# examples/train_lora/llama3_lora_sft.yaml
model_name_or_path: meta-llama/Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_alpha: 16
lora_target: all
dataset: alpaca_en
template: llama3
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
output_dir: ./output/llama3_lora_sft
logging_steps: 10
save_steps: 500
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 5.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
plot_loss: true
QLoRA Config
# QLoRA: 4-bit quantization + LoRA
model_name_or_path: meta-llama/Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
quantization_bit: 4
quantization_method: bitsandbytes
lora_rank: 16
lora_alpha: 32
lora_target: all
dataset: alpaca_en
template: llama3
output_dir: ./output/llama3_qlora
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
Custom Dataset Format
[
{
"instruction": "Translate to French.",
"input": "Hello, how are you?",
"output": "Bonjour, comment allez-vous?"
},
{
"instruction": "Summarize the following text.",
"input": "Long article text here...",
"output": "Summary of the article."
}
]
{
"my_custom_dataset": {
"file_name": "my_data.json",
"columns": {
"prompt": "instruction",
"query": "input",
"response": "output"
}
}
}
Register custom datasets in data/dataset_info.json.
Multi-GPU Training
# DeepSpeed ZeRO-2
FORCE_TORCHRUN=1 llamafactory-cli train \
--deepspeed examples/deepspeed/ds_z2_config.json \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--stage sft \
--finetuning_type lora \
--dataset alpaca_en \
--template llama3 \
--output_dir ./output/llama3-ds
# DeepSpeed ZeRO-3 for large models
FORCE_TORCHRUN=1 llamafactory-cli train \
--deepspeed examples/deepspeed/ds_z3_config.json \
--model_name_or_path meta-llama/Llama-3.1-70B-Instruct \
--stage sft \
--finetuning_type lora \
--dataset alpaca_en \
--template llama3 \
--output_dir ./output/llama3-70b-lora
# Specific GPU selection
CUDA_VISIBLE_DEVICES=0,1,2,3 FORCE_TORCHRUN=1 \
llamafactory-cli train config.yaml
Evaluation and Inference
# Run evaluation
llamafactory-cli eval \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--adapter_name_or_path ./output/llama3-lora \
--template llama3 \
--finetuning_type lora \
--task mmlu \
--lang en
# Interactive chat
llamafactory-cli chat \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--adapter_name_or_path ./output/llama3-lora \
--template llama3 \
--finetuning_type lora
# Launch API server
llamafactory-cli api \
--model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
--adapter_name_or_path ./output/llama3-lora \
--template llama3 \
--finetuning_type lora
Supported Model Templates
| Model Family | Template Value |
|---|---|
| Llama 3 / 3.1 | llama3 |
| Mistral / Mixtral | mistral |
| Qwen 2 / 2.5 | qwen |
| Gemma 2 | gemma |
| Phi-3 | phi |
| ChatGLM 3/4 | chatglm3 / glm4 |
| Yi | yi |
| DeepSeek | deepseek |
| Baichuan 2 | baichuan2 |
| InternLM 2 | intern2 |
Common Commands
| Task | Command |
|---|---|
| Launch web UI | llamafactory-cli webui |
| Train with config | llamafactory-cli train config.yaml |
| Merge LoRA adapter | llamafactory-cli export config.yaml |
| Interactive chat | llamafactory-cli chat config.yaml |
| API server | llamafactory-cli api config.yaml |
| Evaluate model | llamafactory-cli eval config.yaml |
| Check version | llamafactory-cli version |