GPT 3.2
Built on GPT 3.1. Same architecture, more training, a better William.
About this model
GPT 3.2 is the direct successor to GPT 3.1: same RWKV-like architecture, same parameter count (163M), but more training. The pre-training corpus grows from 10B to 11B tokens, and the fine-tuning phase is both longer and more intensive than 3.1's.
The key addition is advanced patching fine-tuning, a new targeted training stage built around William's patterns. Instead of simply running more fine-tuning epochs, patching identifies the specific places where the model drifts from his style and corrects those places directly.
The goals are straightforward: better language processing and understanding overall, and more accurate personality replication. GPT 3.1 is coherent. GPT 3.2 should be coherent and actually sound like William — closing the gap that's been there since GRU 1.
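This page does not say how the patching stage is implemented. As a rough illustration only, one way to "identify where the model drifts" is to score each conversation with some drift measure and keep the worst-scoring ones for an extra targeted pass. Everything in the sketch below is a hypothetical stand-in (the `select_patch_set` name, the `drift_score` callback, the `top_fraction` cutoff), not the actual GPT 3.2 training code.

```python
from typing import Callable, List, Tuple


def select_patch_set(
    examples: List[str],
    drift_score: Callable[[str], float],
    top_fraction: float = 0.1,
) -> List[Tuple[str, float]]:
    """Return the highest-drift examples, worst first.

    `drift_score` is an assumed callback: any function that maps an
    example to a number where higher means "less like the target style"
    (for instance a per-example loss). It is not defined by the source.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    if not examples:
        return []
    # Score every example, then sort so the largest drift comes first.
    scored = sorted(
        ((ex, drift_score(ex)) for ex in examples),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Always keep at least one example so the patch set is never empty.
    keep = max(1, int(len(scored) * top_fraction))
    return scored[:keep]


# Toy usage with made-up scores standing in for a real drift measure.
toy_scores = {"reply_a": 0.2, "reply_b": 1.5, "reply_c": 0.9, "reply_d": 0.1}
patch_set = select_patch_set(list(toy_scores), toy_scores.__getitem__, top_fraction=0.5)
```

In this toy run the two highest-scoring replies are selected for the extra pass. A real stage would also need the correction step itself, which is out of scope for this sketch.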
What changes from GPT 3.1
11B Training Tokens (+10%)
Expanded from 10B to 11B tokens of English and Swedish text — a broader foundation before fine-tuning begins.
Advanced Patching Fine-tuning
A new targeted training stage that identifies exactly where the model drifts from William's style and corrects it directly — not just more epochs of the same thing.
More & Longer Fine-tuning
More fine-tuning passes on the full 53,000 conversations with William. The model spends more time internalising his patterns before deployment.
Same Architecture, Same Scale
RWKV-like architecture, 163M parameters, 2048 token context — identical to GPT 3.1. The improvement comes from training quality, not from scaling up.
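The figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on this page (10B and 11B tokens, 163M parameters). The tokens-per-parameter ratio is a derived figure added here for context and is not something the page itself reports.

```python
PARAMS = 163e6          # parameter count, same for GPT 3.1 and GPT 3.2
TOKENS_GPT_3_1 = 10e9   # pre-training tokens for GPT 3.1
TOKENS_GPT_3_2 = 11e9   # pre-training tokens for GPT 3.2


def percent_increase(old: float, new: float) -> float:
    """Relative growth from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


def tokens_per_param(tokens: float, params: float) -> float:
    """How many training tokens the model sees per parameter."""
    return tokens / params


growth = percent_increase(TOKENS_GPT_3_1, TOKENS_GPT_3_2)    # about 10 percent
ratio_3_1 = tokens_per_param(TOKENS_GPT_3_1, PARAMS)         # about 61 tokens per parameter
ratio_3_2 = tokens_per_param(TOKENS_GPT_3_2, PARAMS)         # about 67 tokens per parameter
```

Because the parameter count is fixed, the extra 1B tokens raises the tokens-per-parameter ratio directly, which is one way to read "training quality, not scaling up".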
Where it sits in the lineup
GRU 1 · Legacy
15M parameters
256 token context
2,700 conversations
GRU architecture
GRU 2 · Stable
~40M parameters
512 token context
9,000 conversations
GRU architecture
GPT 3.1 · Live now
163M parameters
2048 token context
53,000 conversations
10B training tokens
GPT 3.2 · You are here
163M parameters
2048 token context
53,000 conversations
11B training tokens
Advanced patching FT
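For readers who prefer the lineup as data, here is a minimal sketch that records the four cards above and reports which listed numbers differ between two models. The `ModelCard` class and `changed_fields` helper are illustrative names invented for this example. GRU 2's parameter count is approximate ("~40M"), and pre-training token counts are left as `None` for the GRU models because the lineup does not state them.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelCard:
    name: str
    params_m: float                 # parameters, in millions
    context_tokens: int
    conversations: int
    pretrain_tokens_b: Optional[float] = None  # billions; None if not stated


LINEUP = [
    ModelCard("GRU 1", 15, 256, 2_700),
    ModelCard("GRU 2", 40, 512, 9_000),          # listed as "~40M": approximate
    ModelCard("GPT 3.1", 163, 2048, 53_000, 10),
    ModelCard("GPT 3.2", 163, 2048, 53_000, 11),
]


def changed_fields(a: ModelCard, b: ModelCard) -> List[str]:
    """Names of the numeric fields whose values differ between two cards."""
    da, db = asdict(a), asdict(b)
    return [key for key in da if key != "name" and da[key] != db[key]]


gpt_3_1, gpt_3_2 = LINEUP[2], LINEUP[3]
diff = changed_fields(gpt_3_1, gpt_3_2)
```

Comparing GPT 3.1 with GPT 3.2 this way shows that only the pre-training token count changes among the listed numbers. The patching stage is a difference in training recipe, so it does not appear as a field here.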
A note
GPT 3.2 has not been released and no date has been set. It requires new training runs that take significant time. Until then, GPT 3.1 is the recommended model for anyone who wants a real conversation.