Next Generation Announced

GPT 3.2

Built on GPT 3.1. Same architecture, more training, a closer match to William.

163M parameters

2048 token context

11B training tokens

53,000 conversations

RWKV-like architecture

About this model

GPT 3.2 is the direct successor to GPT 3.1. It keeps the same RWKV-like architecture and the same parameter count, but gets more training: the pre-training corpus grows from 10B to 11B tokens, and the fine-tuning phase is both longer and more intensive than 3.1's.

The key addition is advanced patching fine-tuning — a new targeted training stage specifically designed around William's patterns. Rather than just running more fine-tuning epochs, patching identifies the specific places where the model drifts from his style and corrects them with precision.
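This page does not say how patching is implemented. One possible reading, offered purely as an assumption, is a form of hard-example selection: score each conversation by how far the model's reply drifts from William's real reply, then run a further fine-tuning pass on the worst offenders only. The sketch below illustrates just the selection step. The `Sample` type, the word-overlap `drift_score`, and the `0.5` threshold are all hypothetical and are not taken from the actual training pipeline.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    target: str      # William's actual reply
    generated: str   # what the current model produced for the same prompt


def drift_score(target: str, generated: str) -> float:
    """Hypothetical drift measure: 1 minus the Jaccard overlap of the word sets."""
    a = set(target.lower().split())
    b = set(generated.lower().split())
    if not a and not b:
        return 0.0  # two empty replies count as identical
    return 1.0 - len(a & b) / len(a | b)


def select_patch_set(samples: list[Sample], threshold: float = 0.5) -> list[Sample]:
    """Return the samples whose drift exceeds the threshold, worst first."""
    scored = [(drift_score(s.target, s.generated), s) for s in samples]
    drifting = [(d, s) for d, s in scored if d > threshold]
    drifting.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in drifting]
```

A real pipeline would more likely score drift with model loss or a style classifier than with word overlap. The idea that carries over is that the extra pass targets only the conversations where the model misses, instead of repeating every epoch on everything.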

The goals are straightforward: better language processing and understanding overall, and more accurate personality replication. GPT 3.1 is coherent. GPT 3.2 should be coherent and actually sound like William — closing the gap that's been there since GRU 1.

No release date has been set. We'll announce when it's ready.


What changes from GPT 3.1

11B Training Tokens (+10%)

Expanded from 10B to 11B tokens of English and Swedish text — a broader foundation before fine-tuning begins.

Advanced Patching Fine-tuning

A new targeted training stage that identifies exactly where the model drifts from William's style and corrects it directly — not just more epochs of the same thing.

More & Longer Fine-tuning

More fine-tuning passes on the full 53,000 conversations with William. The model spends more time internalising his patterns before deployment.

Same Architecture, Same Scale

RWKV-like architecture, 163M parameters, 2048 token context — identical to GPT 3.1. The improvement comes from training quality, not from scaling up.
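The figures in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted on this page. The tokens-per-parameter ratio is an extra derived quantity added here for illustration, not a metric the page itself reports.

```python
# Figures quoted on this page for GPT 3.1 and GPT 3.2.
PARAMS = 163_000_000
TOKENS_3_1 = 10_000_000_000
TOKENS_3_2 = 11_000_000_000


def percent_increase(old: int, new: int) -> float:
    """Relative growth from old to new, in percent."""
    return (new - old) * 100 / old


def tokens_per_parameter(tokens: int, params: int) -> float:
    """How many pre-training tokens the model sees per parameter."""
    return tokens / params


if __name__ == "__main__":
    print(f"Token increase: {percent_increase(TOKENS_3_1, TOKENS_3_2):.0f}%")
    print(f"GPT 3.1 tokens per parameter: {tokens_per_parameter(TOKENS_3_1, PARAMS):.1f}")
    print(f"GPT 3.2 tokens per parameter: {tokens_per_parameter(TOKENS_3_2, PARAMS):.1f}")
```

Because the parameter count stays fixed at 163M, the extra 1B tokens raises the tokens-per-parameter ratio by the same 10%.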

Where it sits in the lineup

GRU 1 · Legacy

15M parameters

256 token context

2,700 conversations

GRU architecture

GRU 2 · Stable

~40M parameters

512 token context

9,000 conversations

GRU architecture

GPT 3.1 · Live now

163M parameters

2048 token context

53,000 conversations

10B training tokens

GPT 3.2 · You are here

163M parameters

2048 token context

53,000 conversations

11B training tokens

Advanced patching FT

A note

GPT 3.2 has not been released and no date has been set. It requires new training runs that take significant time. Until then, GPT 3.1 is the recommended model for anyone who wants a real conversation.