
Qwen 3 8B

Fine-tune the open-source Qwen 3 8B model with QLoRA on Commissioned's GPUs.

Qwen 3 8B runs on Commissioned's own GPU infrastructure using QLoRA (Quantized Low-Rank Adaptation). It's fundamentally different from the cloud providers — training is fast, and you get a portable adapter you can self-host.
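To make the "portable adapter" idea concrete, here is a minimal NumPy sketch of the low-rank update at the heart of (Q)LoRA. This is an illustration with toy dimensions, not Commissioned's actual training code: the base weight matrix stays frozen, and only two small factor matrices are trained and shipped.

```python
import numpy as np

# Toy dimensions for illustration; real Qwen 3 8B layers are far larger.
d_in, d_out, r = 64, 64, 8  # r is the LoRA rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d_in, d_out))      # frozen base weight (stays on the base model)
A = rng.standard_normal((d_in, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d_out))                    # zero-initialized, so the adapter starts as a no-op
alpha = 16                                  # LoRA scaling hyperparameter

x = rng.standard_normal((1, d_in))

# Forward pass: base output plus the scaled low-rank update.
y = x @ W + (x @ A @ B) * (alpha / r)

# The adapter ships only A and B, a small fraction of the base weights.
base_params = W.size
adapter_params = A.size + B.size
print(adapter_params / base_params)
```

Because only `A` and `B` are saved, the resulting adapter file is small relative to the 8B-parameter base model, which is why it can be downloaded and layered onto a base model elsewhere.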

How it differs

                   Cloud models (OpenAI/Gemini)   Qwen 3 8B
  Training time    30–45 minutes                  ~5 minutes
  Output           Hosted model                   Hosted model + downloadable LoRA adapter
  Self-hosting     Not possible                   Yes — download and run anywhere
  Model size       Large (proprietary)            8B parameters (open-source)
  Quality          Higher ceiling                 Good for focused tasks

LoRA adapters

Qwen fine-tunes produce a LoRA adapter (~50–200 MB) that you can download and run on your own infrastructure with tools like vLLM, Ollama, or llama.cpp.
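As a hedged sketch of what self-hosting can look like, assuming a local Ollama install and an adapter directory named `./qwen3-adapter` (both names illustrative, not part of Commissioned's docs), an Ollama Modelfile can layer a downloaded adapter onto a base model:

```
# Modelfile: base model plus a LoRA adapter (paths and tags are illustrative)
FROM qwen3:8b
ADAPTER ./qwen3-adapter
```

You would then build and run it locally with `ollama create my-qwen -f Modelfile` followed by `ollama run my-qwen`. vLLM and llama.cpp have their own adapter-loading mechanisms; see their documentation for equivalents.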

See LoRA Adapters for download instructions and self-hosting guides.

LoRA adapter downloads are available on all plans including free. Cloud-provider models (OpenAI, Gemini) don't produce downloadable adapters.

When to choose Qwen

  • Rapid prototyping — ~5 minute training lets you iterate quickly
  • Self-hosting — download the adapter and run on your own GPUs
  • Data residency — keep inference on your own infrastructure
  • Cost optimization — amortize GPU costs at scale instead of per-request pricing
  • Offline / air-gapped — run without internet after downloading

Plan: Free
