Setting Lighthouse CI Accessibility Budgets

A passing Lighthouse audit today does nothing to stop a contrast regression from shipping tomorrow. The only durable way to hold an accessibility baseline is to make a drop below it fail the build. Lighthouse CI (@lhci/cli) does exactly that: it runs Lighthouse in your pipeline, asserts your accessibility score and individual audits against declared thresholds, and exits non-zero when an error-level assertion slips, failing the pull request before the regression merges. This guide configures lighthouserc assertions and budgets, wires lhci autorun into GitHub Actions, fails the build on regression, and shows how to allowlist known debt without disabling the gate. It is the enforcement half of the Accessibility Audits with Lighthouse guide.

Mapped WCAG 2.1/2.2 Success Criteria:

  • 1.4.3 Contrast (Minimum) – The color-contrast audit, commonly promoted to a hard error in CI.
  • 4.1.2 Name, Role, Value – Accessible-name audits (button-name, link-name) gated as errors.
  • 1.1.1 Non-text Content – The image-alt audit enforced on every build.
  • 1.3.1 Info and Relationships – Structural audits (heading-order, list) asserted to prevent drift.

Prerequisites

You need a buildable app that serves a real route, Node 18+, and the Lighthouse CI package. Install it as a dev dependency so the pinned version is reproducible across local runs and CI:

npm install --save-dev @lhci/cli
# Sanity-check the binary
npx lhci --version

Decide your threshold before writing config. A common pattern is to assert the current score as the floor (so you never regress) rather than aiming for an aspirational number you don't yet meet—then ratchet the floor up as you fix debt.
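
As a rough illustration of that ratchet, the helper below derives a floor from a measured score and only ever moves the floor upward. The function names and the two-decimal rounding are assumptions made for this sketch; they are not part of Lighthouse CI.

```javascript
// Sketch: derive a minScore floor from the score you measure today,
// and ratchet it so it can rise but never fall.
// floorFromScore and ratchetFloor are hypothetical helper names.

// Round a 0..1 category score DOWN to two decimals so the floor is
// never stricter than what the build currently achieves.
function floorFromScore(score) {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError('score must be a number between 0 and 1');
  }
  // The small epsilon guards against floating-point results such as 96.99999999999999.
  return Math.floor(score * 100 + 1e-9) / 100;
}

// Keep the higher of the committed floor and the floor implied by the latest score.
function ratchetFloor(committedFloor, latestScore) {
  return Math.max(committedFloor, floorFromScore(latestScore));
}
```

With a committed floor of 0.95 and a new measurement of 0.973, ratchetFloor returns 0.97, which is the value you would then commit as minScore. A lower measurement leaves the committed floor untouched.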


The lighthouserc Config: assert.assertions

Lighthouse CI's enforcement lives in assert.assertions. Each entry maps an audit ID, or the special categories:accessibility key, to a severity (off, warn, error) and, optionally, an options object such as { minScore: 0.95 }. An error assertion that fails causes a non-zero exit, which fails the build.

// lighthouserc.js — accessibility budget configuration
module.exports = {
  ci: {
    collect: {
      // Boot the app, then tear it down when collection finishes
      startServerCommand: 'npm run start',
      startServerReadyPattern: 'ready on|listening on|started server',
      url: ['http://localhost:3000/', 'http://localhost:3000/dashboard'],
      numberOfRuns: 3, // repeat runs to damp flaky results; aggregationMethod decides how they are combined
    },
    assert: {
      assertions: {
        // CATEGORY GATE: fail if the weighted accessibility score drops below 0.95
        'categories:accessibility': ['error', { minScore: 0.95 }],

        // PER-AUDIT GATES: promote high-value audits to hard failures.
        // These fail the build even if the overall score stays above 0.95.
        'color-contrast': 'error',
        'button-name': 'error',
        'link-name': 'error',
        'image-alt': 'error',
        'label': 'error',
        'html-has-lang': 'error',
        'document-title': 'error',

        // Structural audits as warnings first; promote to error once clean
        'heading-order': 'warn',
        'list': 'warn',
      },
    },
    upload: {
      // Attach a public report URL to each run for reviewer triage
      target: 'temporary-public-storage',
    },
  },
};

The two assertion styles work together. categories:accessibility with minScore is your budget—a floor on the aggregate weighted score. The per-audit error entries are non-negotiables—specific defects you refuse to ship even if the overall number stays healthy. Pairing them means a regression can't sneak through by failing a heavy audit while other audits happen to improve and keep the average up. For why the aggregate alone is insufficient, see interpreting Lighthouse accessibility scores.
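
To make the masking effect concrete, the sketch below computes a weighted mean over a handful of audits in which color-contrast fails outright. The weights are illustrative assumptions, not Lighthouse's real weights, and the "all-other-passing-audits" entry is a stand-in lumped together for brevity.

```javascript
// Sketch: why a category floor alone can miss a failing audit.
// The weights below are illustrative assumptions, not Lighthouse's real weights.
const audits = [
  { id: 'color-contrast', weight: 7, score: 0 }, // failing
  { id: 'image-alt', weight: 10, score: 1 },
  { id: 'button-name', weight: 10, score: 1 },
  { id: 'link-name', weight: 7, score: 1 },
  { id: 'label', weight: 7, score: 1 },
  { id: 'html-has-lang', weight: 7, score: 1 },
  { id: 'document-title', weight: 7, score: 1 },
  { id: 'all-other-passing-audits', weight: 100, score: 1 }, // lumped for brevity
];

// Weighted mean of audit scores, the general shape of a category score.
function categoryScore(items) {
  const totalWeight = items.reduce((sum, a) => sum + a.weight, 0);
  const weighted = items.reduce((sum, a) => sum + a.weight * a.score, 0);
  return weighted / totalWeight;
}

const score = categoryScore(audits);
const failing = audits.filter((a) => a.score < 1).map((a) => a.id);

console.log(score.toFixed(3)); // 0.955
console.log(score >= 0.95);    // true: the category gate alone would pass
console.log(failing);          // [ 'color-contrast' ]
```

Under these assumed weights the category assertion passes at 0.95 while a contrast defect ships, which is the gap the per-audit error entry closes.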

Testing Hook: Set minScore to your current score, not a future goal. Asserting a floor you already clear means the gate catches regressions immediately, and you raise the floor deliberately as debt is paid down.


A JSON Alternative

If your team prefers JSON over a JS module, lighthouserc.json is equivalent and lhci picks it up automatically:

{
  "ci": {
    "collect": {
      "startServerCommand": "npm run start",
      "url": ["http://localhost:3000/"],
      "numberOfRuns": 3
    },
    "assert": {
      "assertions": {
        "categories:accessibility": ["error", { "minScore": 0.95 }],
        "color-contrast": "error",
        "button-name": "error",
        "image-alt": "error"
      }
    },
    "upload": { "target": "temporary-public-storage" }
  }
}

Testing Hook: Use the JS config (lighthouserc.js) when you need comments, environment-conditional thresholds, or computed URL lists; use JSON for static, lint-friendly configs that other tooling reads.
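
A minimal sketch of what the JS form makes possible, assuming two environment variables (LHCI_A11Y_MIN_SCORE and LHCI_BASE_URL) and a hard-coded route list. All three are hypothetical names chosen for this example, not settings Lighthouse CI reads on its own.

```javascript
// lighthouserc.js sketch: environment-conditional floor and computed URLs.
// LHCI_A11Y_MIN_SCORE, LHCI_BASE_URL, and the route list are hypothetical
// names chosen for this example.
function buildConfig(env) {
  const baseUrl = env.LHCI_BASE_URL || 'http://localhost:3000';
  const routes = ['/', '/dashboard'];

  // Fall back to 0.95 when the variable is unset or not a valid 0..1 number.
  const parsed = Number.parseFloat(env.LHCI_A11Y_MIN_SCORE);
  const minScore = parsed >= 0 && parsed <= 1 ? parsed : 0.95;

  return {
    ci: {
      collect: {
        startServerCommand: 'npm run start',
        url: routes.map((route) => `${baseUrl}${route}`),
        numberOfRuns: 3,
      },
      assert: {
        assertions: {
          'categories:accessibility': ['error', { minScore }],
          'color-contrast': 'error',
          'button-name': 'error',
          'image-alt': 'error',
        },
      },
      upload: { target: 'temporary-public-storage' },
    },
  };
}

module.exports = buildConfig(process.env);
```

Keeping the logic in a function makes the thresholds easy to unit-test, while the final module.exports line hands lhci a plain config object.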


Running lhci autorun in GitHub Actions

lhci autorun chains collect → assert → upload in one command, reading your lighthouserc. The treosh/lighthouse-ci-action wraps it for GitHub Actions, surfaces the assertion results, and can attach artifacts to the run.

# .github/workflows/lighthouse-budget.yml
name: Lighthouse Accessibility Budget
on:
  pull_request:
    branches: [main]
jobs:
  a11y-budget:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - run: npm ci
      - run: npm run build   # produce the production build lhci will serve

      - name: Run Lighthouse CI and assert the accessibility budget
        uses: treosh/lighthouse-ci-action@v12
        with:
          configPath: ./lighthouserc.js
          # The action exits non-zero when an `error` assertion fails,
          # which fails the job and blocks the PR.
          uploadArtifacts: true        # attach HTML reports to the workflow run
          temporaryPublicStorage: true # produce a shareable report URL per run

Because the action propagates lhci's exit code, a failed error assertion fails the job automatically—no extra scripting required. To make the gate mandatory, mark this job as a required status check in your branch protection rules, the same way you would for any other CI/CD accessibility gate.

Testing Hook: Confirm the job is listed under branch protection's required checks. An assertion that fails the job but isn't required will let a regressing PR merge anyway.


Failing the Build on Regression

The gate is only as good as its determinism. Two settings keep it from flaking and either blocking clean PRs or waving through real regressions:

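  • numberOfRuns: 3 (or more) gives each assertion several samples to work with. How they are combined is set per assertion by aggregationMethod (median, optimistic, pessimistic, or median-run). The default is optimistic, which uses the most favorable run, so set aggregationMethod: 'median' in the assertion's options if you want the median. Accessibility audits are largely deterministic, but emulated viewports and content that renders late can wobble on a single run.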
  • Audit the production build, served by startServerCommand, not the dev server. Dev builds inject extra markup and warning UI that can change audit results.
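
Lighthouse CI lets each assertion choose an aggregationMethod for combining runs. The sketch below shows what three of those strategies mean for a minScore check. It illustrates the idea only and is not Lighthouse CI's implementation; the run scores are made up, and averaging the two middle values for an even number of runs is a choice made for this sketch.

```javascript
// Sketch of what the aggregation strategies mean for a minScore assertion.
// This illustrates the idea only; it is not Lighthouse CI's implementation.
function aggregateScores(scores, method) {
  if (scores.length === 0) throw new Error('need at least one run');
  const sorted = [...scores].sort((a, b) => a - b);
  switch (method) {
    case 'optimistic': // most favorable run for a "higher is better" score
      return sorted[sorted.length - 1];
    case 'pessimistic': // least favorable run
      return sorted[0];
    case 'median': {
      const mid = Math.floor(sorted.length / 2);
      // For an even number of runs this sketch averages the two middle values.
      return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }
    default:
      throw new Error(`unknown aggregation method: ${method}`);
  }
}

// Three hypothetical accessibility scores from numberOfRuns: 3
const runs = [0.96, 0.93, 0.94];
const floor = 0.95;
for (const method of ['optimistic', 'median', 'pessimistic']) {
  const value = aggregateScores(runs, method);
  console.log(method, value, value >= floor ? 'pass' : 'fail');
}
// optimistic 0.96 pass
// median 0.94 fail
// pessimistic 0.93 fail
```

In a config, the method goes in the assertion's options, for example ['error', { minScore: 0.95, aggregationMethod: 'median' }]. The same three runs pass or fail the same floor depending on that choice, so it is worth setting deliberately.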

When an error assertion fails, the log names the failing audit and prints the expected threshold alongside the value that was found, so a reviewer sees that color-contrast missed its threshold rather than a bare red X. Pair that with the uploaded report URL and the regression is diagnosable from the PR alone.

# Reproduce the CI gate locally before pushing
npm run build
npx lhci autorun --config=./lighthouserc.js
echo "Exit code: $?"   # non-zero means an error assertion failed

Testing Hook: Run npx lhci autorun locally and read the exit code. A non-zero exit is precisely what fails CI; reproducing it locally turns a red pipeline into a fast feedback loop.


Per-Audit Allowlist for Known Debt

Real codebases carry accessibility debt you can't fix in the same PR that adds the gate. The wrong move is to lower the whole budget or disable the gate; the right move is a scoped exception that keeps every other check strict.

Use off (or warn) on the specific audit you're temporarily tolerating, ideally with a comment and a tracking ticket. This is an allowlist, not a surrender—every other audit stays at error.

// lighthouserc.js — scoped allowlist for tracked debt
module.exports = {
  ci: {
    collect: { url: ['http://localhost:3000/'], numberOfRuns: 3 },
    assert: {
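      // Inherit a strict preset, then override only the debt items.
      // Note: lighthouse:recommended also asserts performance, SEO, and
      // best-practices audits, so expect non-accessibility failures too.
      preset: 'lighthouse:recommended',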
      assertions: {
        // Keep the accessibility budget strict
        'categories:accessibility': ['error', { minScore: 0.95 }],
        'color-contrast': 'error',
        'button-name': 'error',

        // KNOWN DEBT — downgrade, don't delete. Re-promote when fixed.
        // JIRA A11Y-482: legacy date-picker fires aria-required-children.
        'aria-required-children': 'warn',

        // ALLOWED FAILURE — third-party embed we don't control yet.
        // JIRA A11Y-510: vendor widget lacks landmark; remove when vendor ships fix.
        'landmark-one-main': 'off',
      },
    },
  },
};

For audits where some failing nodes are tolerated but the count must not grow, cap the count with a maxLength option on that assertion rather than turning it off entirely. maxLength asserts on the number of items the audit reports, so new instances of the same defect still fail the build.

Testing Hook: Tie every allowlist entry to a ticket ID in a comment. Audit the config in code review: an off without a tracking reference is invisible debt that will quietly become permanent.
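
That review step can be partly automated. The sketch below scans config source text for 'off' or 'warn' entries that have no ticket reference in the comment lines directly above them. The ticket pattern (for example A11Y-482) and the function name are assumptions for this example, and the line-based matching only understands the one-entry-per-line layout used in this guide.

```javascript
// Sketch: flag 'off'/'warn' assertions that lack a ticket reference
// in the comment lines directly above them. The ticket pattern (ABC-123)
// and the function name are assumptions for this example.
const TICKET = /\b[A-Z][A-Z0-9]+-\d+\b/;
const DOWNGRADE = /^\s*['"]([\w:-]+)['"]\s*:\s*['"](off|warn)['"]/;

function findUntrackedDebt(source) {
  const lines = source.split('\n');
  const untracked = [];
  lines.forEach((line, index) => {
    const match = line.match(DOWNGRADE);
    if (!match) return;
    // Walk upward through the contiguous comment lines above the entry.
    let tracked = false;
    for (let i = index - 1; i >= 0 && /^\s*\/\//.test(lines[i]); i--) {
      if (TICKET.test(lines[i])) {
        tracked = true;
        break;
      }
    }
    // Report the audit ID and its 1-based line number.
    if (!tracked) untracked.push({ audit: match[1], line: index + 1 });
  });
  return untracked;
}
```

One way to use it is to read lighthouserc.js with fs.readFileSync in a pre-merge script and fail when the returned array is non-empty.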


How to Verify

Verify the gate works with both a tool check and a manual deliberate-regression test:

Tool check. Run the gate locally against your current build and confirm it passes, then read the assertion results to confirm the thresholds are what you intend:

npm run build && npx lhci autorun --config=./lighthouserc.js
# Exit 0 = all assertions met. Inspect the uploaded report URL printed in the log.

Manual regression test. Prove the gate actually blocks a regression—a gate you've never seen fail is a gate you can't trust. Introduce a known violation on a branch (drop the alt on an image, or set body text to a contrast-failing color), push the PR, and confirm the GitHub Actions job turns red with the image-alt or color-contrast assertion named in the log. Revert once you've seen it fail. This single test validates the entire chain: collect, assert, exit code, and branch protection.

For an additional manual pass on the things this score can't see (keyboard, focus order, screen-reader meaning), pair the gate with the manual checklist in interpreting Lighthouse accessibility scores.


Common a11y Mistakes

  1. Asserting only categories:accessibility minScore. The aggregate can hide a heavy audit failing while others improve. Add per-audit error assertions for non-negotiables.
  2. Setting an aspirational minScore you don't yet meet. The build is red from day one and the team disables the gate. Floor at the current score, then ratchet up.
  3. Auditing the dev server. Dev-only markup and warnings skew results. Always serve the production build via startServerCommand.
  4. Single-run audits. Timing-sensitive checks flake. Use numberOfRuns: 3 and choose an aggregationMethod deliberately (the default is optimistic, not median).
  5. Disabling the whole gate for one piece of debt. Use a scoped off/warn on the specific audit with a tracking ticket, never a global downgrade.
  6. Forgetting branch protection. A failing job that isn't a required check lets the PR merge anyway.

Conclusion

A Lighthouse CI accessibility budget converts a one-time audit into a standing guarantee: scores can't quietly regress because a drop below your floor or a failure of a non-negotiable audit fails the pull request. Pair a categories:accessibility minScore with per-audit error assertions, run it through lhci autorun in GitHub Actions as a required check, aggregate multiple runs against the production build, and scope known debt with tracked allowlist entries. Then prove it by watching a deliberate regression turn the build red.


Frequently Asked Questions

How do I make Lighthouse CI fail my build when the accessibility score drops? Add a categories:accessibility assertion with severity error and a minScore floor in your lighthouserc config, then run it via lhci autorun. When the measured score falls below minScore, lhci exits non-zero, which fails the CI job. With the GitHub Actions job marked as a required status check, that failure blocks the pull request from merging.

Should I assert the category score or individual audits? Both. The categories:accessibility minScore is your aggregate budget, but it can mask a specific heavy audit failing if other audits improve. Promote your non-negotiable audits—color-contrast, button-name, image-alt, label—to error assertions so those defects fail the build regardless of where the overall score lands.

How do I handle known accessibility debt without disabling the whole gate? Use a scoped allowlist: set the specific failing audit to warn or off in assert.assertions, leave every other audit at error, and add a comment with a tracking ticket. This keeps the gate strict everywhere except the one tolerated item, and the ticket reference makes the debt visible in code review so it gets re-promoted to error once fixed.

Why does my Lighthouse CI score fluctuate between runs? Some audits are sensitive to timing, viewport emulation, or content that renders late, so a single run can vary. Set numberOfRuns to 3 or more and pick an aggregationMethod for each assertion: the default, optimistic, uses the most favorable run, while median is more resistant to a single outlier. Also audit the production build via startServerCommand rather than the dev server, since dev-only markup can change results between runs.

What's the difference between Lighthouse CI assertions and a performance budget file? Assertions in assert.assertions cover any audit or category, including accessibility, with severities and minScore. A budgets.json performance budget targets resource sizes and timings specifically. For accessibility gating you use assertions—categories:accessibility minScore plus per-audit error entries—not the performance budget file.