Project maintained by simonangel-fong Hosted on GitHub Pages — Theme by mattgraham

AI Fundamental - Fine-Tuning

Fine-Tuning

Fine-tuning
- The process of taking a pre-trained model (such as GPT or LLaMA) and training it further on a task-specific dataset.
- It uses proprietary or domain-specific data to improve output quality and produce domain-relevant results.
Base model: General knowledge
Fine-tuned model: Specialized behavior

The Analogy:

Pre-trained LLM: A smart generalist who has read a large part of the public internet.

Fine-tuning: Sending that generalist to a specific job-training program to learn your company’s unique workflows.

When Do You Use Fine-Tuning?

Use fine-tuning when prompting (Zero-shot or Few-shot) alone is not enough to achieve the desired consistency or specialized behavior.
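
Before reaching for fine-tuning, it helps to see what the two prompting styles look like. The sketch below is only an illustration: `build_prompt`, the task wording, and the sample sentences are hypothetical and not tied to any particular model or API.

```python
def build_prompt(task, query, examples=()):
    """Return a zero-shot prompt when `examples` is empty, else a few-shot prompt."""
    parts = [task]
    # Each worked example becomes one Input/Output pair shown to the model.
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The final pair is left open so the model completes the output.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


task = "Classify the sentiment of the input as positive or negative."
zero_shot = build_prompt(task, "The deploy went smoothly.")
few_shot = build_prompt(
    task,
    "The deploy went smoothly.",
    examples=[("Build broke again.", "negative"), ("Tests are green.", "positive")],
)
print(few_shot)
```

If outputs are still inconsistent after adding a handful of examples like this, that is a signal that fine-tuning may be worth the cost.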

Common Use Cases

Domain-specific knowledge: Medical, legal, finance, or internal company policies.
Style & Tone Control: Customer support tone, brand voice (formal, friendly, etc.).
Structured Outputs: Consistently generating JSON, API responses, or SQL queries.
Classification Tasks: Sentiment analysis, ticket routing, or fraud detection.
Task Specialization: Code generation for specific frameworks, or log analysis (a high-value DevOps use case).

Common Techniques

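Three techniques come up most often. The first two differ in how many parameters are trained; the third describes the format of the training data and can be combined with either of them: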

Full Fine-Tuning

Training all model parameters.

Pros: Highest potential performance.
Cons: Extremely expensive (GPU/Compute), time-consuming, and prone to “catastrophic forgetting.”
Status: Not common in practice for most enterprises.
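
To make the cost concrete, here is a rough back-of-the-envelope estimate. It assumes the commonly cited figure of about 16 bytes per parameter for mixed-precision training with Adam (2 for fp16 weights, 2 for gradients, 12 for fp32 master weights and the two optimizer moments). Activations are ignored, and the 7-billion-parameter size is only an example.

```python
def full_finetune_memory_gb(num_params, bytes_per_param=16):
    """Rough memory for weights, gradients, and Adam state (activations excluded)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical 7B-parameter model.
estimate_gb = full_finetune_memory_gb(7e9)
print(f"~{estimate_gb:.0f} GB before activations")
```

Under these assumptions the training state alone exceeds the memory of a single typical GPU, which is why full fine-tuning usually needs a multi-GPU setup.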

Parameter-Efficient Fine-Tuning (PEFT)

The industry standard for most applications.

Examples: LoRA (Low-Rank Adaptation), QLoRA, Adapters.
Concept: Freeze the base model weights (typically well over 99% of all parameters) and train only small added "adapter" layers.
Pros: Cheap, fast, requires much less VRAM, and highly portable.
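
The saving can be checked with simple arithmetic. For one weight matrix of shape d_out × d_in, LoRA trains two low-rank factors with rank × (d_in + d_out) parameters in total instead of d_in × d_out. The layer size and rank below are illustrative values, not taken from a specific model.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (parameters in the full matrix, parameters in the LoRA factors)."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(f"full: {full:,}  lora: {lora:,}  trainable share: {lora / full:.2%}")
```

For this example the adapter holds 65,536 trainable parameters against 16,777,216 frozen ones, well under 1% of the layer.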

Instruction Fine-Tuning

Training the model on examples in an Instruction → Response format, optionally with a separate input field.

Example:
- Instruction: Summarize this log.
- Input: <log data>
- Output: <summary>
Goal: Teach the model to follow explicit human instructions.
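
A minimal sketch of how such a record might be rendered into a single training string. The `### Instruction` / `### Input` / `### Response` markers are an assumption for illustration; use whatever template your base model was trained with.

```python
def format_example(instruction, input_text, output_text=""):
    """Render one record; leave `output_text` empty to build an inference prompt."""
    prompt = (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_text}\n\n"
        f"### Response:\n"
    )
    return prompt + output_text


training_text = format_example(
    "Summarize this log.",
    "ERROR: Docker build failed COPY requirements.txt not found",
    "The Docker build failed because requirements.txt was not in the build context.",
)
inference_prompt = format_example("Summarize this log.", "<log data>")
print(training_text)
```

Using the same function for training and inference keeps the two formats identical, which matters because the model learns the template along with the task.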

Simple Example

Goal: Train a model to analyze CI/CD failures automatically.

Step 1: Prepare Dataset

Format your data as input → output pairs:

{
  "input": "ERROR: Docker build failed COPY requirements.txt not found",
  "output": "Root Cause: requirements.txt missing from build context. Fix: ensure file exists or correct path."
}

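Rule of thumb: roughly 100 to 10,000+ high-quality examples, depending on how complex the task is. Quality and consistency matter more than raw count.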
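
Training data like this is commonly stored as JSON Lines, one record per line. The sketch below validates records and serializes only the valid ones; the helper names and the second (deliberately invalid) record are hypothetical.

```python
import json


def validate_record(record):
    """Return a list of problems found in one training record (empty if valid)."""
    problems = []
    for key in ("input", "output"):
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty '{key}'")
    return problems


def to_jsonl(records):
    """Serialize valid records as JSON Lines, skipping invalid ones."""
    return "\n".join(json.dumps(r) for r in records if not validate_record(r))


records = [
    {
        "input": "ERROR: Docker build failed COPY requirements.txt not found",
        "output": "Root Cause: requirements.txt missing from build context. "
                  "Fix: ensure file exists or correct path.",
    },
    {"input": "", "output": "placeholder"},  # invalid: empty input, gets skipped
]
jsonl = to_jsonl(records)
print(jsonl)
```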

Step 2: Choose Base Model

Examples:

Open models: LLaMA, Mistral
API-based: OpenAI models (fine-tuning is supported for selected models)

Step 3: Train (Fine-Tune)

Using LoRA (typical modern approach):

Load base model
Freeze weights
Train small adapters on your dataset
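
The three steps above can be sketched on a single tiny linear layer in plain Python. This is a toy illustration of the LoRA idea, not a training recipe: the matrix values, rank, `alpha`, and learning rate are made up, and in practice a library would handle this for a full model. The frozen weight `W` is never updated; only the low-rank factors `A` and `B` are.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, scale):
    """Compute W x + scale * B (A x); W is the frozen base weight."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def loss_fn(pred, target):
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))


# Step 1: "load" the base weight (3 x 4). Step 2: freeze it (never modified below).
W = [[0.5, -0.2, 0.1, 0.0], [0.3, 0.8, -0.5, 0.2], [-0.1, 0.4, 0.9, -0.3]]
W_before = [row[:] for row in W]

rank, alpha = 2, 2.0
scale = alpha / rank
A = [[0.1, 0.2, 0.0, -0.1], [0.0, 0.1, 0.3, 0.2]]  # rank x d_in, small non-zero init
B = [[0.0] * rank for _ in range(3)]                # d_out x rank, zero init

x, target = [1.0, 2.0, 0.5, -1.0], [1.0, 0.0, 2.0]
lr = 0.05
initial_loss = loss_fn(lora_forward(W, A, B, x, scale), target)

# Step 3: train only the adapter factors with plain gradient descent.
for _ in range(10):
    pred = lora_forward(W, A, B, x, scale)
    err = [p - t for p, t in zip(pred, target)]  # dLoss/dPred
    u = matvec(A, x)
    Bt_err = [sum(B[i][k] * err[i] for i in range(3)) for k in range(rank)]
    B = [[B[i][k] - lr * scale * err[i] * u[k] for k in range(rank)] for i in range(3)]
    A = [[A[k][j] - lr * scale * Bt_err[k] * x[j] for j in range(4)] for k in range(rank)]

final_loss = loss_fn(lora_forward(W, A, B, x, scale), target)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the base layer, so training starts from the pre-trained behavior rather than from noise.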

Step 4: Evaluate

Test with unseen logs:

Input:

ERROR: npm install failed package.json missing

Expected output:

Root Cause: package.json missing
Fix: add package.json or correct working directory
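
One simple way to score such tests is exact match on the root-cause field. The sketch below assumes responses follow the "Root Cause: ... / Fix: ..." layout shown above; the prediction strings are hand-written stand-ins for model output, not real results.

```python
def parse_fields(text):
    """Extract 'Root Cause' and 'Fix' values from a response, lowercased."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Root Cause", "Fix"):
            fields[key.strip()] = value.strip().lower()
    return fields


def root_cause_accuracy(predictions, references):
    """Fraction of examples whose predicted root cause equals the reference."""
    matches = 0
    for pred, ref in zip(predictions, references):
        predicted = parse_fields(pred).get("Root Cause")
        expected = parse_fields(ref).get("Root Cause")
        if predicted is not None and predicted == expected:
            matches += 1
    return matches / len(references)


references = [
    "Root Cause: package.json missing\nFix: add package.json or correct working directory",
    "Root Cause: requirements.txt missing from build context\nFix: ensure file exists",
]
predictions = [  # stand-ins for what a model might return
    "Root Cause: package.json missing\nFix: run npm init",
    "Root Cause: Docker daemon not running\nFix: start Docker",
]
accuracy = root_cause_accuracy(predictions, references)
print(accuracy)
```

Exact match is strict; for free-text fields such as the fix suggestion, a human review or a softer similarity measure is usually needed as well.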

Step 5: Deploy

Use it in your pipeline:

CI/CD fails →
Send logs →
Model returns:
- Root cause
- Fix suggestion
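
The flow above could be wired into a pipeline step roughly as follows. Everything here is a sketch: `extract_error_lines`, `analyze_failure`, and `fake_model` are hypothetical names, and `fake_model` stands in for a call to wherever the fine-tuned model is served.

```python
def extract_error_lines(log_text, max_lines=20):
    """Keep only the last `max_lines` lines containing 'ERROR' to keep the prompt short."""
    errors = [line for line in log_text.splitlines() if "ERROR" in line]
    return "\n".join(errors[-max_lines:])


def analyze_failure(log_text, model):
    """Send the error lines to `model` (any callable str -> str) and return its answer."""
    snippet = extract_error_lines(log_text)
    if not snippet:
        return None  # nothing to analyze
    return model(snippet)


def fake_model(prompt):
    # Stand-in for the fine-tuned model; a real step would call an inference endpoint.
    return "Root Cause: package.json missing\nFix: add package.json or correct working directory"


log = "INFO: starting job\nERROR: npm install failed package.json missing\nINFO: job finished"
result = analyze_failure(log, fake_model)
print(result)
```

Passing the model in as a callable keeps the pipeline step testable without a GPU or network access.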

Fine-Tuning vs RAG

Method	Purpose
Fine-tuning	Change model behavior/style
RAG (Retrieval-Augmented Generation)	Inject external knowledge

Rule of thumb:

Knowledge (policies, docs) → use RAG
Behavior (how to respond) → use fine-tuning

When to Avoid Fine-Tuning

You just need better prompts → use prompt engineering
Knowledge changes frequently → use RAG
Your dataset is small → the model may overfit
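
These rules of thumb can be written down as a small decision helper. It is only a sketch of the guidance in this note: the function name is hypothetical, and the 100-example threshold is an assumption borrowed from the lower end of the dataset range given earlier.

```python
def choose_approach(prompting_is_enough, knowledge_changes_often, num_examples,
                    min_examples=100):
    """Map the rules of thumb to a recommendation string."""
    if prompting_is_enough:
        return "prompt engineering"
    if knowledge_changes_often:
        return "RAG"
    if num_examples < min_examples:
        return "collect more data (fine-tuning may overfit)"
    return "fine-tuning"


print(choose_approach(False, False, num_examples=2000))
print(choose_approach(False, True, num_examples=2000))
```

The order of the checks reflects cost: cheaper options are tried before fine-tuning is recommended.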