Flex Inference: 50% Off LLM Calls on Gemini, OpenAI, and Bedrock
Every major AI provider now offers half-price inference if you can tolerate a few extra seconds of latency. One parameter change. Same API. Here's how it works and why.
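As a sketch of what "one parameter change" looks like: on OpenAI's API the flag is `service_tier="flex"`, and the rest of the request is unchanged. (Gemini and Bedrock spell their discounted tiers differently; the request builder below is a hypothetical helper, not a library function, and the model name is illustrative.)

```python
# Minimal sketch of a flex-tier request body for OpenAI's
# Chat Completions API. The only flex-specific piece is
# service_tier="flex" -- everything else is an ordinary request.
def build_flex_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "service_tier": "flex",  # half-price, best-effort latency
        "messages": [{"role": "user", "content": prompt}],
    }

params = build_flex_request("o4-mini", "Summarize this log file.")
print(params["service_tier"])  # flex
```

The same dict can be passed straight to the client's `chat.completions.create(**params)`; if the flex queue is busy, requests may wait longer or time out, which is the trade you make for the discount.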
Blog
Most of the site lives in the blog archive.
Using Gemini's bounding box detection to get precise measurements when converting a screenshot to code. Plus how prompt caching and flex inference make the multi-pass approach surprisingly cheap.
Universal Commerce Protocol lets AI agents buy things. Here's how developers can monetize it and what store owners need to know.
How to configure SSH so Claude Code can run commands on remote servers.
Start Here
If you want the overall picture, start with the archive.
Tools
Small utilities and extensions.
Paste a YouTube URL. AI watches the video and writes a scroll-synced text breakdown.
Calculator for checking API costs across Claude, GPT, and Gemini, including prompt caching.
Clamp() calculator plus the tutorial explaining the math behind it.
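The cost calculator above boils down to simple per-token arithmetic; here is a minimal sketch. The prices are placeholder values for illustration, not any provider's real rates; the point is that cached input tokens bill at a steep discount, which is what makes prompt caching pay off on long, repeated prefixes.

```python
# Per-call cost from token counts and per-million-token prices.
# All rates below are hypothetical placeholders.
def call_cost(uncached_in: int, cached_in: int, out: int,
              in_price: float, cached_price: float,
              out_price: float) -> float:
    return (uncached_in * in_price
            + cached_in * cached_price
            + out * out_price) / 1_000_000

# Example: 2,000 fresh input tokens, 50,000 cached, 500 output,
# with placeholder rates of $3 / $0.30 / $15 per million tokens.
print(round(call_cost(2_000, 50_000, 500, 3.0, 0.30, 15.0), 4))  # 0.0285
```

Without caching, the same call would bill all 52,000 input tokens at the full rate, roughly five times the input cost in this example.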