Build a Code Assistant
Fine-tune a model on your codebase, conventions, and internal APIs.
This guide walks through creating a fine-tuned model that understands your codebase, follows your team's conventions, and knows your internal APIs.
What you'll need
- Source code — your actual codebase (or key parts of it)
- Documentation — READMEs, API docs, architecture docs, style guides
- Code review comments (optional) — examples of feedback your team gives
Step by step
Select your training data
Don't upload your entire monorepo. Be selective (a filtering sketch follows these lists):
High-value data:
- Core modules and libraries your team wrote
- Internal API documentation and OpenAPI specs
- Style guides and coding conventions
- Architecture decision records (ADRs)
- Well-written code with good comments
- Code review discussions (from PRs)
Skip:
- Generated code (build artifacts, lock files)
- Third-party dependencies
- Test fixtures and mock data (unless you want the model to write tests)
- Binary files
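To make the selection repeatable, you can script it. Here's a minimal sketch in Python; the skip lists and file extensions are assumptions to adjust for your own repo layout:
from pathlib import Path

# Hypothetical exclusions -- adjust these to your repo layout.
SKIP_DIRS = {"node_modules", "dist", "build", ".git", "__pycache__"}
SKIP_FILES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}
INCLUDE_SUFFIXES = {".ts", ".tsx", ".md"}  # source and docs only

def candidate_files(root: str):
    """Yield files worth bundling, skipping generated code and dependencies."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if any(part in SKIP_DIRS for part in path.parts):
            continue  # third-party dependencies and build artifacts
        if path.name in SKIP_FILES or path.suffix not in INCLUDE_SUFFIXES:
            continue  # lock files, binaries, everything else
        yield path

for f in candidate_files("src"):
    print(f)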
Prepare the files
Concatenate related files into logical groups and save as .txt or .md:
# Example: bundle your API layer
find src/api -name "*.ts" -exec cat {} + > api-layer.txt
# Example: bundle your core library
find packages/core/src -name "*.ts" -exec cat {} + > core-lib.txt
Also include documentation separately:
- README.md files
- API documentation (.md or .txt)
- Style guide (if you have one)
Remove any secrets, API keys, credentials, or connection strings before uploading.
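A quick pattern scan catches obvious leaks before you upload. This is a minimal sketch; the regexes are illustrative, and a dedicated scanner such as gitleaks or trufflehog will catch far more:
import re
import sys
from pathlib import Path

# Illustrative patterns only -- extend with the key formats your stack uses.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|password|token)\s*[:=]\s*\S+"),
    re.compile(r"(?i)[a-z]+://[^/\s:]+:[^@\s]+@"),      # URLs with embedded credentials
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def scan(path: Path) -> None:
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            print(f"{path}:{lineno}: possible secret: {line.strip()[:80]}")

for arg in sys.argv[1:]:
    scan(Path(arg))
Run it over each bundle before uploading, e.g. python scan_secrets.py api-layer.txt core-lib.txt (the script name is just an example).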
Create the fine-tune
Upload your files and describe the use case:
"Create a coding assistant specialized in our [language/framework] codebase. It should follow our coding conventions: [list key conventions, e.g., 'we use TypeScript with strict mode, prefer functional components, use Tailwind for styling']. It knows our internal API at [briefly describe your API structure]. When writing code, it should match the patterns in the training data."
Recommended base model: GPT-4.1 (best code understanding and generation quality)
Test with real tasks
Try the kinds of tasks your team does daily:
- "Write a new API endpoint for [feature] following our conventions"
- "Refactor this function to match our style guide: [paste code]"
- "How does our authentication middleware work?"
- "Write a unit test for the UserService.createUser method"
- "What's the right way to add a new database migration in our project?"
The model should produce code that looks like it was written by your team.
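You can turn these prompts into a repeatable smoke test. Here's a sketch using the OpenAI Python client, assuming your fine-tune is served behind an OpenAI-compatible chat completions endpoint; the model ID and prompts are placeholders:
from openai import OpenAI

client = OpenAI()

SMOKE_TESTS = [
    "Write a new API endpoint for password reset following our conventions",
    "How does our authentication middleware work?",
    "Write a unit test for the UserService.createUser method",
]

for prompt in SMOKE_TESTS:
    response = client.chat.completions.create(
        model="your-code-model-id",  # your fine-tuned model's ID
        messages=[{"role": "user", "content": prompt}],
    )
    # Eyeball each answer: does it match your conventions and real APIs?
    print(f"=== {prompt}\n{response.choices[0].message.content}\n")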
Integration ideas
IDE extension / CLI tool
Call the API from a script or custom extension:
from openai import OpenAI

client = OpenAI()

def ask_code_assistant(question: str, code_context: str = "") -> str:
    messages = [{"role": "user", "content": question}]
    if code_context:
        # Prepend the relevant code as a system message so the model can
        # ground its answer in the snippet you care about.
        messages.insert(0, {
            "role": "system",
            "content": f"Here's the relevant code context:\n\n{code_context}"
        })
    response = client.chat.completions.create(
        model="your-code-model-id",
        messages=messages,
    )
    return response.choices[0].message.content
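For example (the file path and endpoint name are hypothetical):
# Hypothetical file and endpoint -- substitute your own.
code_context = open("src/api/users.ts").read()
print(ask_code_assistant("Add pagination to the listUsers endpoint", code_context))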
Code review bot
Feed PR diffs to your model for automated review suggestions:
def review_code(diff: str) -> str:
    return ask_code_assistant(
        f"Review this code change and suggest improvements "
        f"based on our team's conventions:\n\n{diff}"
    )
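To try it on a real change, feed it a diff straight from git (the base branch here is an assumption; use whatever your team branches from):
import subprocess

# Diff the current branch against main; swap in your team's base branch.
diff = subprocess.run(
    ["git", "diff", "main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout
print(review_code(diff))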
Tips
- Update regularly — as your codebase evolves, retrain with current code
- Include examples of good and bad code — if you have style guide violations and corrections, include both
- Don't include everything — a focused model trained on well-written code is better than one trained on every file in the repo