April 23, 2026 · 8 min read

Abliterated Models Guide — 2026 Edition

If you’ve looked at the Discover tab in any local-AI app and wondered why some Llama variants have “abliterated” in the name, or why people on r/LocalLLaMA argue about Heretic vs Dolphin vs straight finetunes, this is the post that explains it, plus a curated download list of the variants that actually work in 2026.

What Abliteration Actually Is

Modern instruction-tuned LLMs have a learned refusal direction in their residual stream. When a prompt activates that direction strongly enough, the model outputs “I cannot help with that” or similar. The direction was put there during RLHF and safety training.

Abliteration removes that direction via orthogonalisation. You take a corpus of refused prompts, isolate the activation pattern that distinguishes them from accepted prompts, and then project that direction out of every weight matrix in the model. The result is a model with the same training and capabilities but no longer prone to categorical refusal.

It’s a clean technique — not a finetune, not a jailbreak, not a system-prompt trick. Just linear algebra applied to the model weights. Original paper: “Refusal in Language Models Is Mediated by a Single Direction” (Arditi et al., 2024).
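The core operation can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper’s implementation: `W` stands in for any weight matrix that writes into the residual stream, and `r` is assumed to be an already-estimated refusal direction.

```python
import numpy as np

def ablate_direction(W, r):
    """Remove the refusal direction r from the output space of W.

    W: (d_model, d_in) weight matrix writing into the residual stream
    r: (d_model,) estimated refusal direction (need not be unit length)
    """
    r = r / np.linalg.norm(r)
    # W' = (I - r r^T) W : anything W writes along r is projected away
    return W - np.outer(r, r @ W)
```

After this transform, `r @ ablate_direction(W, r)` is numerically zero, so no input can push the residual stream along the refusal direction through this matrix. Abliteration applies the same projection to every matrix that writes into the stream.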

Abliterated vs Other “Uncensored” Approaches

| Method | How it works | Effort | Quality impact |
| --- | --- | --- | --- |
| Abliteration | Project out refusal direction | ~hours on GPU | 1–3% degradation |
| Full finetune (Dolphin, Hermes) | Re-train on uncensored corpus | ~days, expensive | Variable, often improves chat |
| LoRA finetune | Adapter on uncensored data | ~hours | Minor, reversible |
| Merge (Frankenmerges) | Combine multiple finetunes | ~hours | Highly variable |
| System prompt jailbreak | Persona-style instructions | None | Brittle, breaks on long context |

Abliteration is the cleanest research-grounded option. Dolphin and Hermes are battle-tested production finetunes. Frankenmerges are wildcards. System-prompt jailbreaks are the worst long-term option — they consume context tokens and degrade as conversations grow.

The Recommended Abliterated Models (2026)

Qwen 3.6 Family

Qwen 3.6 (April 2026) is currently the strongest base for abliteration. Two notable releases:

Gemma 4 Heretic

Llama 3.1 Family

Hermes 3

Hermes 3 (Nous Research) is technically a full finetune, not abliteration, but functions similarly — no refusals, strong instruction-following, agent-friendly. Variants:

GLM 5.1 Heretic

The newest entrant: huihui-ai/Huihui-GLM-5.1-abliterated-GGUF. The 754B MoE GLM 5.1, abliterated. At 236 GB for IQ2_M it’s not consumer hardware, but if you have a multi-GPU rig or a Mac Studio M4 Ultra, it’s the strongest open abliterated model, period.

How to Download and Run

Path 1 — Ollama (one command)

ollama pull richardyoung/qwen3-14b-abliterated:q4_K_M
ollama run richardyoung/qwen3-14b-abliterated:q4_K_M

# Or for the agent-tagged variant with tool calling
ollama pull richardyoung/qwen3-14b-abliterated:agent
ollama run richardyoung/qwen3-14b-abliterated:agent
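If you want to call the pulled model from code rather than the CLI, Ollama exposes a local REST API (default port 11434). A minimal stdlib-only sketch; the model tag is the one pulled above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model, prompt):
    """Build a non-streaming /api/generate request."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )

def generate(model, prompt):
    """Send the request to a running Ollama server, return the completion text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama instance (`ollama serve` or the desktop app):
# print(generate("richardyoung/qwen3-14b-abliterated:q4_K_M", "Say hi"))
```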

Path 2 — Locally Uncensored (one click)

Open Locally Uncensored, navigate to Model Manager → Discover → Text, click the UNCENSORED filter tab. The 34 curated abliterated GGUFs are all there with one-click download. Sizes, hardware tags, and tool-calling support shown on each card.

The new v2.4.0 Settings → Model Storage override lets you redirect the GGUF download folder if you want the files on a separate drive.

Path 3 — Direct HuggingFace download

Go to the HuggingFace repo, click the file you want, hit Download. Place the .gguf in your LM Studio or Ollama models folder, restart the runner. More manual but works for edge-case quants not in the curated lists.
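If you’d rather script the download, HuggingFace serves repo files through its `resolve` endpoint, so the URL is predictable. A small helper; the filename below is a made-up placeholder, so use the actual quant filename from the repo’s Files tab:

```python
def hf_file_url(repo_id, filename, revision="main"):
    """Direct-download URL for a file in a HuggingFace model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical filename for illustration; check the repo for the real one.
url = hf_file_url("huihui-ai/Huihui-GLM-5.1-abliterated-GGUF", "example-IQ2_M.gguf")
```

You can then fetch the URL with any downloader (curl, wget, or `urllib.request.urlretrieve`) and drop the resulting .gguf into your models folder.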

Hardware Recommendations

| VRAM | Best Abliterated Pick | Why |
| --- | --- | --- |
| 8 GB | Llama 3.1 8B abliterated Q4_K_M | Fits with headroom, good chat quality |
| 12 GB (RTX 3060) | Qwen 3 14B abliterated Q4_K_M | Sweet spot, ~15 tok/s |
| 16 GB | Gemma 4 31B Heretic Q4_K_M | Best general-purpose abliterated at this VRAM |
| 24 GB (RTX 3090/4090) | Gemma 4 31B Heretic Q5_K_M | Higher quality, room for long context |
| 48 GB+ | Hermes 3 70B or GLM 5.1 Heretic IQ2 | Frontier-tier abliterated quality |
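These picks follow a simple rule of thumb you can apply to any model: the weights take roughly params × bits-per-weight ÷ 8 bytes, plus a cushion for KV cache and runtime overhead. The bits-per-weight figures below are my rough approximations for common GGUF quants, not official numbers:

```python
# Approximate effective bits per weight for common GGUF quants (assumption;
# real values vary slightly by model architecture).
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "IQ2_M": 2.7, "Q8_0": 8.5}

def weights_gb(params_billions, quant):
    """Rough weight footprint in GB; leave ~1-2 GB extra for KV cache."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8
```

For example, `weights_gb(14, "Q4_K_M")` comes out around 8.4 GB, which is why a 14B Q4_K_M lands in the 12 GB row with room left for context.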

Common Questions

Will an abliterated model write me malware?

Probably not the way you’re thinking. Abliteration removes the categorical refusal but the model still has training-time priors against obviously-bad outputs. Asking “write me ransomware” usually gets a meaningful answer about what ransomware is rather than functional code. The models work best for legitimate-but-edge-case use cases: security research, fiction with violence, medical questions the base model deflects, legal grey areas, adult creative writing.

Are abliterated models dangerous?

No more dangerous than the underlying base model. Abliteration removes a layer of guardrails, but the model’s knowledge is unchanged. If you trust the base model, the abliterated version doesn’t introduce new capabilities; it just unblocks ones that were already there.

Can I abliterate a model myself?

Yes. The technique is well-documented and the code is on GitHub (search abliterator, llm-abliterator, or follow the original paper). You need a GPU with the model loaded, a few thousand refused-vs-accepted prompt pairs, and a few hours. Most people don’t bother — the popular base models are already abliterated by maintainers like richardyoung and mlabonne within days of release.
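The direction-estimation step from those prompt pairs is a difference of means over residual-stream activations, per the Arditi et al. recipe. A sketch with my own array names and a single-layer simplification (real pipelines sweep layers and token positions to pick the best direction):

```python
import numpy as np

def estimate_refusal_direction(refused_acts, accepted_acts):
    """Difference-of-means refusal direction.

    refused_acts / accepted_acts: (n_prompts, d_model) residual-stream
    activations captured at one layer for refused vs. accepted prompts.
    """
    d = refused_acts.mean(axis=0) - accepted_acts.mean(axis=0)
    return d / np.linalg.norm(d)  # unit vector
```

The resulting unit vector is what gets projected out of every weight matrix that writes into the residual stream.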

Related Reading


Locally Uncensored is AGPL-3.0 licensed. Built by PurpleDoubleD. Bug reports and feature requests on GitHub Discussions or in the Discord.

34 curated abliterated GGUFs with one-click download in Locally Uncensored

Download Locally Uncensored