
Qwen 3 8B

Fine-tune the open-source Qwen 3 8B model with QLoRA on Commissioned's GPUs.

Qwen 3 8B runs on Commissioned's own GPU infrastructure using QLoRA (Quantized Low-Rank Adaptation). It's fundamentally different from the cloud providers — training is fast, and you get a portable adapter you can self-host.
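To make the "portable adapter" idea concrete, here is a minimal NumPy sketch of the low-rank update at the heart of (Q)LoRA. This is an illustration with toy dimensions, not Commissioned's actual training code: the base weight matrix stays frozen, and only two small factor matrices are trained and shipped.

```python
import numpy as np

# Toy dimensions for illustration; real Qwen 3 8B layers are far larger.
d_in, d_out, r = 64, 64, 8  # r is the LoRA rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d_in, d_out))      # frozen base weight (stays on the base model)
A = rng.standard_normal((d_in, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d_out))                    # zero-initialized, so the adapter starts as a no-op
alpha = 16                                  # LoRA scaling hyperparameter

x = rng.standard_normal((1, d_in))

# Forward pass: base output plus the scaled low-rank update.
y = x @ W + (x @ A @ B) * (alpha / r)

# The adapter ships only A and B, a small fraction of the base weights.
base_params = W.size
adapter_params = A.size + B.size
print(adapter_params / base_params)
```

Because only `A` and `B` are saved, the resulting adapter file is small relative to the 8B-parameter base model, which is why it can be downloaded and layered onto a base model elsewhere.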

How it differs

                   Cloud models (OpenAI/Gemini)   Qwen 3 8B
  Training time    30–45 minutes                  ~5 minutes
  Output           Hosted model                   Hosted model + downloadable LoRA adapter
  Self-hosting     Not possible                   Yes — download and run anywhere
  Model size       Large (proprietary)            8B parameters (open-source)
  Quality          Higher ceiling                 Good for focused tasks

LoRA adapters

Qwen fine-tunes produce a LoRA adapter (~50–200 MB) that you can download and run on your own infrastructure with tools like vLLM, Ollama, or llama.cpp.
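As a hedged sketch of what self-hosting can look like, assuming a local Ollama install and an adapter directory named `./qwen3-adapter` (both names illustrative, not part of Commissioned's docs), an Ollama Modelfile can layer a downloaded adapter onto a base model:

```
# Modelfile: base model plus a LoRA adapter (paths and tags are illustrative)
FROM qwen3:8b
ADAPTER ./qwen3-adapter
```

You would then build and run it locally with `ollama create my-qwen -f Modelfile` followed by `ollama run my-qwen`. vLLM and llama.cpp have their own adapter-loading mechanisms; see their documentation for equivalents.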

See LoRA Adapters for download instructions and self-hosting guides.

LoRA adapter downloads are available on all plans including free. Cloud-provider models (OpenAI, Gemini) don't produce downloadable adapters.

When to choose Qwen

  • Rapid prototyping — ~5 minute training lets you iterate quickly
  • Self-hosting — download the adapter and run on your own GPUs
  • Data residency — keep inference on your own infrastructure
  • Cost optimization — amortize GPU costs at scale instead of per-request pricing
  • Offline / air-gapped — run without internet after downloading

Plan: Free
