The 8purple blog

Field notes on AI & infrastructure

Practical writing on applied AI, GPU compute, platforms and the craft of shipping software that survives contact with production.

Applied AI · 3 min read

Building RAG Systems That Don't Hallucinate

Retrieval-augmented generation is easy to demo and hard to trust. Here is what separates a toy from a system you can put in front of customers.

May 28, 2026

Applied AI · 3 min read

LLM Agents Beyond the Demo: What Production Actually Looks Like

Agent demos run a perfect path once. Production agents face the other 200 paths. Here is how to design for the ones the demo never showed.

May 22, 2026

Data & Infrastructure · 3 min read

Vector Databases, Explained Without the Hype

Do you need a dedicated vector database, or is your existing one enough? A practical look at what these systems actually do and when they earn their keep.

May 15, 2026

Applied AI · 3 min read

Fine-Tuning vs Prompting: Choosing the Cheaper Path

Fine-tuning feels like the serious option. Most of the time it is the expensive answer to a question prompting already solved. Here is how to tell them apart.

May 8, 2026

Applied AI · 3 min read

How to Actually Evaluate an LLM Feature

You cannot ship what you cannot measure. Evaluating generative systems is harder than traditional software testing — and skipping it is how good demos become bad products.

Apr 30, 2026

Data & Infrastructure · 3 min read

The Real Economics of Running GPUs

GPU sticker prices get the headlines, but the bill that matters is utilisation, power, and idle time. A field guide to what AI compute really costs.

Apr 21, 2026

AI Trends · 3 min read

Small Language Models and the Quiet Shift to the Edge

The race for ever-larger models grabbed the headlines. The more consequential trend may be the opposite: small models good enough to run on a phone.

Apr 12, 2026

Data & Infrastructure · 3 min read

MLOps Foundations: From Notebook to Reliable Service

A model that works in a notebook is a science project. A model that serves real traffic reliably is an engineering system. Bridging the two is what MLOps is for.

Apr 3, 2026

Security · 3 min read

Prompt Injection and the New AI Attack Surface

When your application takes instructions in plain language, attackers can write instructions too. Prompt injection is the vulnerability class that traditional security never prepared us for.

Mar 25, 2026

Data & Infrastructure · 3 min read

Model Quantization: Smaller, Faster, Almost as Good

Quantization shrinks a model by storing its numbers with less precision. Done well, it cuts memory and cost dramatically while barely touching quality. Here is the intuition and the tradeoffs.

Mar 14, 2026