Storybook Visual Regression Testing Without the Per-Snapshot Bill
If you maintain a design system, your Storybook already lists every component and
variant your team cares about — which is to say, everything worth screenshot testing.
This guide turns that list into visual regression tests on every PR,
using Storybook's index.json, a Playwright reporter, and a flat-rate plan
instead of a per-snapshot bill.
Why component-level diffs catch what page-level misses
Page-level visual testing tells you a page changed; component-level testing tells you which component changed and in which states. A design token tweak that slightly mis-renders the disabled button doesn't show up on your homepage screenshot — the homepage doesn't render a disabled button. Your Storybook does, in a dedicated story, in isolation, at a stable URL:
https://your-storybook.example.com/iframe.html?id=components-button--disabled&viewMode=story
That URL shape is why Storybook is such an easy target: hundreds of small, isolated captures you can enumerate from one manifest file and run in parallel.
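As a minimal sketch of that URL shape, a helper like the one below builds the iframe URL for a given story ID. The name `storyIframeUrl` is hypothetical; only the `iframe.html?id=...&viewMode=story` pattern comes from the example above.

```typescript
// Hypothetical helper: builds the isolated iframe URL for one story.
function storyIframeUrl(storybookUrl: string, storyId: string): string {
  // A trailing slash keeps any sub-path (e.g. a GitHub Pages repo path) intact
  // when the relative "iframe.html" is resolved against the base.
  const base = storybookUrl.endsWith('/') ? storybookUrl : `${storybookUrl}/`;
  const url = new URL('iframe.html', base);
  url.searchParams.set('id', storyId);
  url.searchParams.set('viewMode', 'story');
  return url.toString();
}

console.log(
  storyIframeUrl('https://your-storybook.example.com', 'components-button--disabled'),
);
```

Going through `URL` and `searchParams` rather than string concatenation means a story ID containing unusual characters is still encoded correctly.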
The architecture in one sentence
Captures happen in your CI runner via Playwright; the
@corralimited/snapdiff-playwright reporter uploads each PNG to SnapDiff,
creates one build per test run, and SnapDiff diffs each story against its per-branch
baseline — it never runs your Storybook or your tests. Three moving parts:
- A Storybook deployment your CI runner can reach (Vercel, Netlify, GitHub Pages, or anywhere with a stable URL).
- A Playwright test that loops over index.json and snapshots each story.
- A GitHub workflow that runs it on every PR and posts a snapdiff/visual-test commit status.
Enumerate every story from index.json
Storybook 7+ publishes a manifest at the root of the build. The whole test is a loop over it — new stories are picked up automatically, with no list to maintain:
// tests/storybook/stories.spec.ts
import { test } from '@corralimited/snapdiff-playwright';

const STORYBOOK_URL = process.env.STORYBOOK_URL!;

// Top-level await: this file has to load as an ES module.
const res = await fetch(`${STORYBOOK_URL}/index.json`);
if (!res.ok) throw new Error(`index.json request failed: ${res.status}`);
const { entries } = await res.json();

for (const entry of Object.values(entries)) {
  if (entry.type !== 'story') continue; // skip MDX docs pages
  if (entry.tags?.includes('snapdiff-skip')) continue;

  test(entry.id, async ({ page, snapshot }) => {
    // Relative URL: resolved against baseURL from playwright.config.ts.
    await page.goto(`/iframe.html?id=${entry.id}&viewMode=story`);
    // Wait until the story has rendered at least one element into the root.
    await page.waitForFunction(
      () => document.querySelector('#storybook-root')?.children.length,
    );
    await snapshot(entry.id);
  });
}
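The filtering in that loop can also be pulled into a pure function, which makes it easy to check outside Playwright. The sketch below is illustrative only: the `IndexEntry` and `StoryIndex` types model just the fields the loop reads, and `snapshotTargets` is a hypothetical name, not part of Storybook or the reporter.

```typescript
// Only the index.json fields the loop above actually uses.
interface IndexEntry {
  id: string;
  type: string; // 'story' or 'docs'
  tags?: string[];
}

interface StoryIndex {
  entries: Record<string, IndexEntry>;
}

// Hypothetical helper: story IDs worth snapshotting, in a stable order.
function snapshotTargets(index: StoryIndex, skipTag = 'snapdiff-skip'): string[] {
  return Object.values(index.entries)
    .filter((entry) => entry.type === 'story') // drop docs pages
    .filter((entry) => !entry.tags?.includes(skipTag)) // drop opted-out stories
    .map((entry) => entry.id)
    .sort();
}

// Made-up sample data in the shape of index.json.
const sampleIndex: StoryIndex = {
  entries: {
    'components-button--primary': { id: 'components-button--primary', type: 'story' },
    'components-button--disabled': { id: 'components-button--disabled', type: 'story', tags: ['autodocs'] },
    'components-chart--live': { id: 'components-chart--live', type: 'story', tags: ['snapdiff-skip'] },
    'intro--docs': { id: 'intro--docs', type: 'docs' },
  },
};

console.log(snapshotTargets(sampleIndex));
// → [ 'components-button--disabled', 'components-button--primary' ]
```

Sorting is optional; it only keeps the test order stable if the manifest's key order ever changes.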
Wire the reporter into playwright.config.ts with your SnapDiff project slug
(it reads SNAPDIFF_API_KEY from the environment), turn on
fullyParallel with ~8 workers, and set retries: 0 — a flaky
capture that retries to green would silently mask a real change. With ~100 stories and
4 cores, a full run is about 30 seconds of wall clock.
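A configuration along these lines would match that description. Treat it as a sketch under stated assumptions: `fullyParallel`, `workers`, `retries`, `use.baseURL`, and the `reporter` array are standard Playwright options, but the reporter's module path and its `project` option name are guesses based on the text above, so check the reporter's own documentation for the real shape.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests/storybook',
  fullyParallel: true, // run stories in parallel within the spec file
  workers: 8,
  retries: 0, // a retry that goes green could hide a real change
  use: {
    baseURL: process.env.STORYBOOK_URL, // lets page.goto() use relative URLs
  },
  reporter: [
    ['list'],
    // ASSUMPTION: the module path and the `project` option name are hypothetical.
    // The reporter is described as reading SNAPDIFF_API_KEY from the environment.
    ['@corralimited/snapdiff-playwright', { project: 'your-project-slug' }],
  ],
});
```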
To skip a story whose output is nondeterministic, add tags: ['snapdiff-skip'] in the story definition. Faker data, MSW randomness,
and live dates are the usual suspects.
Run it on every PR
name: Storybook Visual Regression
on:
  pull_request:
  push:
    branches: [main]
jobs:
  storybook-visual:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npx playwright test
        env:
          SNAPDIFF_API_KEY: ${{ secrets.SNAPDIFF_API_KEY }}
          STORYBOOK_URL: https://design-system.acme.com
Each run creates a build in the SnapDiff dashboard. The PR gets a
snapdiff/visual-test status with the count of changed stories; the linked
build page shows each one with before/after and overlay views and per-story
Approve/Reject buttons. Approving promotes those
captures as the new baselines and flips the check green. The push: main
trigger keeps the main baselines current after merges.
When a design token change flips 50 stories
Component-level visual testing is much chattier than page-level — that's the point, but it needs two habits:
- Batch approval. When a token change cascades intentionally, use the build page's "Approve all changed" action instead of clicking through 50 stories.
- Kill nondeterminism at the source. Builds compare with a fixed 0.1% match tolerance, and SnapDiff filters anti-aliasing jitter server-side — so genuine 1-pixel border changes still surface. If stories flap, the cause is almost always random content: disable faker/MSW randomness in production Storybook builds, pin dates, and tag the rest with
snapdiff-skip.
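One way to make story data deterministic is to replace ad hoc `Math.random()` and `new Date()` calls with a seeded generator and a pinned timestamp. The sketch below uses the small, widely circulated mulberry32 generator; `sampleRows`, `FIXED_NOW`, and the seed value are hypothetical names chosen for illustration. If your stories use Faker, its `faker.seed()` call serves the same purpose.

```typescript
// mulberry32: a tiny seeded PRNG. Same seed, same sequence, on every run.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Pinned "now" so relative dates render identically in every capture.
const FIXED_NOW = new Date('2024-01-15T12:00:00Z');

// Hypothetical fixture builder for a table story.
function sampleRows(count: number, seed = 42) {
  const random = mulberry32(seed);
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    score: Math.floor(random() * 100),
    createdAt: FIXED_NOW.toISOString(),
  }));
}

// Two calls with the same seed produce identical fixtures.
console.log(JSON.stringify(sampleRows(3)) === JSON.stringify(sampleRows(3))); // → true
```

Because the generator is created inside `sampleRows`, each story gets the same data no matter which order the stories render in.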
Protected Storybook deployments
Because capture happens in your CI runner, a Storybook behind Vercel Deployment Protection or Cloudflare Access works without giving SnapDiff any access — the reporter inherits whatever Playwright sends:
// playwright.config.ts
use: {
  baseURL: STORYBOOK_URL,
  extraHTTPHeaders: {
    'x-vercel-protection-bypass': process.env.VERCEL_AUTOMATION_BYPASS_SECRET!,
    'x-vercel-set-bypass-cookie': 'true',
  },
},
The same property handles authenticated app pages generally — anything Playwright can log into, the reporter can snapshot.
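If the same config has to serve both protected and unprotected deployments, the headers can be built conditionally instead of relying on the non-null assertion. This is a sketch, not part of the reporter: `protectionHeaders` is a hypothetical helper, and only the two header names and the environment variable name come from the snippet above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the Vercel bypass headers only when a secret is set.
function protectionHeaders(env: Env): Record<string, string> {
  const secret = env.VERCEL_AUTOMATION_BYPASS_SECRET;
  if (!secret) return {}; // unprotected deployment: send no extra headers
  return {
    'x-vercel-protection-bypass': secret,
    'x-vercel-set-bypass-cookie': 'true',
  };
}

console.log(protectionHeaders({ VERCEL_AUTOMATION_BYPASS_SECRET: 'example-secret' }));
console.log(protectionHeaders({}));
```

In `playwright.config.ts` the result would be passed as `extraHTTPHeaders: protectionHeaders(process.env)`. Returning an empty object is a design choice; throwing when the secret is missing may be the better fit if every deployment is expected to be protected.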
The pricing math
The usual complaint about Storybook visual testing services is the meter: the better your story coverage, the more every PR costs. SnapDiff plans are flat monthly rates, so the math is just picking a plan. One diff per story per run:
| Design system | Monthly volume | Plan that covers it |
|---|---|---|
| 100 stories, ~40 PR runs/mo | ~4,000 diffs | Pro — $59/mo (5,000 diffs) |
| 300 stories, ~50 PR runs/mo | ~15,000 diffs | Team — $119/mo (20,000 diffs) |
| 500+ stories, agency volume | 50,000+ diffs | Scale — $299/mo (100,000 diffs) |
The point being: adding a story shouldn't be a billing decision. Overage rates and screenshot quotas are on the pricing page.
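The arithmetic behind the table fits in a few lines. The sketch below hard-codes the plan limits quoted in this guide (the free plan's 200 diffs are mentioned in the closing section) and ignores overage rates and screenshot quotas; `monthlyDiffs` and `smallestPlanFor` are hypothetical names.

```typescript
interface Plan {
  name: string;
  monthlyUsd: number;
  diffsPerMonth: number;
}

// Limits as quoted in this guide, ordered from smallest to largest.
const PLANS: Plan[] = [
  { name: 'Free', monthlyUsd: 0, diffsPerMonth: 200 },
  { name: 'Pro', monthlyUsd: 59, diffsPerMonth: 5_000 },
  { name: 'Team', monthlyUsd: 119, diffsPerMonth: 20_000 },
  { name: 'Scale', monthlyUsd: 299, diffsPerMonth: 100_000 },
];

// One diff per story per CI run.
function monthlyDiffs(stories: number, runsPerMonth: number): number {
  return stories * runsPerMonth;
}

// First plan whose quota covers the volume, or undefined if none does.
function smallestPlanFor(diffs: number): Plan | undefined {
  return PLANS.find((plan) => diffs <= plan.diffsPerMonth);
}

console.log(smallestPlanFor(monthlyDiffs(100, 40))?.name); // → Pro
console.log(smallestPlanFor(monthlyDiffs(300, 50))?.name); // → Team
```

If runs triggered by pushes to main also count toward the quota, include them in `runsPerMonth` alongside the PR runs.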
Frequently asked questions
Is SnapDiff a Chromatic alternative?
For visual regression testing of Storybook stories, yes — with two structural differences. SnapDiff doesn't host or publish your Storybook (bring your own deployment), and pricing is flat-rate per month rather than per snapshot, so testing a large design system on every PR doesn't produce a usage-based bill.
How do I capture every story automatically?
Storybook 7+ publishes an index.json manifest of every story ID. A short Playwright test fetches it, loops over the entries, navigates to each story's iframe.html?id=...&viewMode=story URL, and calls snapshot(). New stories are picked up automatically.
How many diffs does a design system use per month?
One diff per story per CI run. A 100-story Storybook with 40 PR runs a month is ~4,000 diffs — within the Pro plan ($59/mo, 5,000 diffs). A 500-story system at similar volume lands on Team or Scale. Flat-rate plans make this a plan choice, not a metered bill.
How do I skip flaky or nondeterministic stories?
Tag them with snapdiff-skip and filter on that tag in the iteration loop. For stories that flap because of faker/MSW randomness or live dates, the better fix is making production Storybook builds deterministic — that pays off beyond visual testing.
Can I test hover, focus, and other interaction states?
Yes. Captures run through Playwright in your CI runner, so any interaction Playwright can perform can happen before snapshot(). The cleanest pattern is modeling interaction states as dedicated stories so they're enumerated in index.json like everything else.
Does it work if my Storybook is behind authentication?
Yes. The reporter inherits Playwright's configuration, so protected deployments work via extraHTTPHeaders — Vercel's bypass header, Cloudflare Access credentials, or basic auth. SnapDiff's servers never need access to the deployment; they only receive the PNGs.
Point it at your Storybook
The free plan's 200 diffs a month will baseline a small component library and catch its first regression.