LLaMA-Factory Commands

LLaMA-Factory is an easy-to-use LLM fine-tuning framework supporting 100+ models with methods including SFT, RLHF, DPO, PPO, LoRA, QLoRA, and full fine-tuning. It provides both a web UI and CLI interface.

Installation

# Clone and install
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

# Or install with all extras
pip install -e ".[torch,metrics,deepspeed,bitsandbytes,vllm,gptq,awq,aqlm]"

# Verify installation
llamafactory-cli version

Web UI

# Launch the web training UI
llamafactory-cli webui

# Launch on specific port
GRADIO_SERVER_PORT=7860 llamafactory-cli webui

# Launch with share link for remote access
GRADIO_SHARE=1 llamafactory-cli webui

CLI Training

# Train with a YAML config
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

# Train with inline arguments
llamafactory-cli train \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --stage sft \
  --do_train true \
  --finetuning_type lora \
  --dataset alpaca_en \
  --template llama3 \
  --output_dir ./output/llama3-lora \
  --per_device_train_batch_size 2 \
  --gradient_accumulation_steps 4 \
  --lr_scheduler_type cosine \
  --learning_rate 5e-5 \
  --num_train_epochs 3 \
  --bf16 true \
  --lora_rank 8 \
  --lora_alpha 16

# Export / merge LoRA adapter with base model
llamafactory-cli export \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --adapter_name_or_path ./output/llama3-lora \
  --template llama3 \
  --finetuning_type lora \
  --export_dir ./output/llama3-merged \
  --export_size 5

Training Stages

# Supervised Fine-Tuning (SFT)
stage: sft
finetuning_type: lora
dataset: alpaca_en

# Reward Modeling
stage: rm
finetuning_type: lora
dataset: comparison_gpt4_en

# PPO (Proximal Policy Optimization)
stage: ppo
finetuning_type: lora
reward_model: ./output/reward_model
dataset: alpaca_en

# DPO (Direct Preference Optimization)
stage: dpo
finetuning_type: lora
dataset: comparison_gpt4_en

# KTO (Kahneman-Tversky Optimization)
stage: kto
finetuning_type: lora
dataset: kto_en

SFT Config Example

# examples/train_lora/llama3_lora_sft.yaml
model_name_or_path: meta-llama/Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_alpha: 16
lora_target: all
dataset: alpaca_en
template: llama3
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
output_dir: ./output/llama3_lora_sft
logging_steps: 10
save_steps: 500
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 5.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
plot_loss: true

QLoRA Config

# QLoRA: 4-bit quantization + LoRA
model_name_or_path: meta-llama/Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
quantization_bit: 4
quantization_method: bitsandbytes
lora_rank: 16
lora_alpha: 32
lora_target: all
dataset: alpaca_en
template: llama3
output_dir: ./output/llama3_qlora
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true

Custom Dataset Format

[
  {
    "instruction": "Translate to French.",
    "input": "Hello, how are you?",
    "output": "Bonjour, comment allez-vous?"
  },
  {
    "instruction": "Summarize the following text.",
    "input": "Long article text here...",
    "output": "Summary of the article."
  }
]

{
  "my_custom_dataset": {
    "file_name": "my_data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}

Multi-GPU Training

# DeepSpeed ZeRO-2
FORCE_TORCHRUN=1 llamafactory-cli train \
  --deepspeed examples/deepspeed/ds_z2_config.json \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --stage sft \
  --finetuning_type lora \
  --dataset alpaca_en \
  --template llama3 \
  --output_dir ./output/llama3-ds

# DeepSpeed ZeRO-3 for large models
FORCE_TORCHRUN=1 llamafactory-cli train \
  --deepspeed examples/deepspeed/ds_z3_config.json \
  --model_name_or_path meta-llama/Llama-3.1-70B-Instruct \
  --stage sft \
  --finetuning_type lora \
  --dataset alpaca_en \
  --template llama3 \
  --output_dir ./output/llama3-70b-lora

# Specific GPU selection
CUDA_VISIBLE_DEVICES=0,1,2,3 FORCE_TORCHRUN=1 \
  llamafactory-cli train config.yaml

Evaluation and Inference

# Run evaluation
llamafactory-cli eval \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --adapter_name_or_path ./output/llama3-lora \
  --template llama3 \
  --finetuning_type lora \
  --task mmlu \
  --lang en

# Interactive chat
llamafactory-cli chat \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --adapter_name_or_path ./output/llama3-lora \
  --template llama3 \
  --finetuning_type lora

# Launch API server
llamafactory-cli api \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --adapter_name_or_path ./output/llama3-lora \
  --template llama3 \
  --finetuning_type lora

Supported Model Templates

Model Family	Template Value
Llama 3 / 3.1	`llama3`
Mistral / Mixtral	`mistral`
Qwen 2 / 2.5	`qwen`
Gemma 2	`gemma`
Phi-3	`phi`
ChatGLM 3/4	`chatglm3` / `glm4`
Yi	`yi`
DeepSeek	`deepseek`
Baichuan 2	`baichuan2`
InternLM 2	`intern2`

Common Commands

Task	Command
Launch web UI	`llamafactory-cli webui`
Train with config	`llamafactory-cli train config.yaml`
Merge LoRA adapter	`llamafactory-cli export config.yaml`
Interactive chat	`llamafactory-cli chat config.yaml`
API server	`llamafactory-cli api config.yaml`
Evaluate model	`llamafactory-cli eval config.yaml`
Check version	`llamafactory-cli version`