Qwen 3 8B
Fine-tune the open-source Qwen 3 8B model with QLoRA on Commissioned's GPUs.
Qwen 3 8B runs on Commissioned's own GPU infrastructure using QLoRA (Quantized Low-Rank Adaptation). Unlike the cloud providers, training completes in minutes rather than tens of minutes, and you get a portable adapter you can self-host.
How it differs
| | Cloud models (OpenAI/Gemini) | Qwen 3 8B |
|---|---|---|
| Training time | 30–45 minutes | ~5 minutes |
| Output | Hosted model | Hosted model + downloadable LoRA adapter |
| Self-hosting | Not possible | Yes — download and run anywhere |
| Model size | Large (proprietary) | 8B parameters (open-source) |
| Quality | Higher ceiling | Good for focused tasks |
LoRA adapters
Qwen fine-tunes produce a LoRA adapter (~50–200 MB) that you can download and run on your own infrastructure with tools like vLLM, Ollama, or llama.cpp.
See LoRA Adapters for download instructions and self-hosting guides.
LoRA adapter downloads are available on all plans including free. Cloud-provider models (OpenAI, Gemini) don't produce downloadable adapters.
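As a minimal sketch of the self-hosting workflow above, a downloaded adapter can be layered on the base model with an Ollama Modelfile. The base-model tag, adapter path, and model name below are illustrative assumptions, not values from this page:

```shell
# Write a Modelfile that layers the downloaded adapter on the base model.
# Assumptions: the base tag "qwen3:8b" and the adapter path are placeholders —
# adjust both to match your Ollama library tag and download location.
cat > Modelfile <<'EOF'
FROM qwen3:8b
ADAPTER ./qwen3-lora-adapter
EOF

ollama create my-finetune -f Modelfile   # build the combined model
ollama run my-finetune                   # chat with it locally
```

The same adapter directory also works with vLLM or llama.cpp; only the packaging step differs per tool.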
When to choose Qwen
- Rapid prototyping — ~5 minute training lets you iterate quickly
- Self-hosting — download the adapter and run on your own GPUs
- Data residency — keep inference on your own infrastructure
- Cost optimization — amortize GPU costs at scale instead of per-request pricing
- Offline / air-gapped — run without internet after downloading
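The self-hosting and data-residency points above can be put into practice by serving the adapter behind vLLM's OpenAI-compatible server. This is a sketch under assumptions: the adapter path and the module name `my-finetune` are placeholders, while `--enable-lora` and `--lora-modules` are standard vLLM flags:

```shell
# Serve the base model with the downloaded adapter attached as a LoRA module.
# Assumption: the adapter was downloaded to ./qwen3-lora-adapter.
vllm serve Qwen/Qwen3-8B \
  --enable-lora \
  --lora-modules my-finetune=./qwen3-lora-adapter

# Requests that set "model": "my-finetune" are routed through the adapter,
# so inference never leaves your own infrastructure:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "my-finetune", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the server speaks the OpenAI API, existing client code can switch from per-request cloud pricing to your own GPUs by changing only the base URL and model name.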
Plan: Free