PWA Support for the Aleph Site — Design

Date: 2026-06-25
Status: Approved (design), pending implementation plan
Author: Sophie + Claude

Context

The Aleph site is a Jekyll static site hosted on GitHub Pages (also deployed via Vercel) on a custom domain over HTTPS. It already has most PWA building blocks:

The one missing piece for a real installable PWA is a service worker. Without it, browsers won’t treat the site as installable and there’s no offline support.

The site is auth-bearing: it has a login/ section, verify-email.html, a _protected/ area, and a shared-JWT login API. In production the login frontend calls a separate origin (https://auth.sophiebi.com), but vercel.json also routes same-origin /api/* and /health to the login dispatcher (vercel.json:16-17) — so auth-bearing endpoints exist on the primary origin too. The central design constraint is therefore: never cache or serve stale auth/sensitive content, and the service worker must fail closed (cache only content proven safe) rather than fail open (cache everything not blocklisted).

Goal

Deliver, from a single hand-written service worker:

  1. Installability — “Add to Home Screen”, standalone launch.
  2. Offline access — static assets and a fallback page work with no network.
  3. Faster repeat loads — cached static assets served instantly.

…without ever caching, storing, or serving a stale copy of any auth/sensitive response.

Non-Goals

Components

Six changes.

1. sw.js (new, repo root → served at /sw.js)

Plain static JavaScript (not Jekyll-templated). Located at root so its scope covers the whole origin (/). Approx 80 lines.

Cache name: aleph-v1 — a single versioned constant. Bumping the version invalidates all precached assets.

install event:

Note: the precache call (cache.addAll) is atomic: if any URL fails to fetch or returns a non-OK status such as 404, the whole install fails. The list above must be kept in sync with files that actually exist at those paths.

/ (home page) is intentionally NOT precached: doing so couples install success to the home page always being a public 200. At runtime / is served via network-passthrough, with /offline.html as the offline fallback.
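A minimal sketch of the install step under the constraints above. Only `/offline.html` is taken from this design; the rest of the precache list is left as a placeholder comment rather than guessed. The `precache` helper name and the `ServiceWorkerGlobalScope` guard are assumptions added for illustration, so the file can also be loaded outside a worker (for example in a unit test).

```javascript
// Single versioned cache name; bumping it invalidates every precached asset.
const CACHE_NAME = 'aleph-v1';

// Precache list. Only /offline.html comes from the design. Every entry must be
// a real file, because cache.addAll() rejects as a whole if any single request
// fails or returns a non-OK status. '/' is deliberately absent.
const PRECACHE_URLS = [
  '/offline.html',
  // ...static assets that are known to exist at build time go here
];

// Hypothetical helper: opens the versioned cache and fills it all-or-nothing.
async function precache(cacheStorage) {
  const cache = await cacheStorage.open(CACHE_NAME);
  await cache.addAll(PRECACHE_URLS);
}

// Only register the listener when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('install', (event) => {
    // If this promise rejects, the install fails and any old worker stays active.
    event.waitUntil(precache(self.caches));
  });
}
```

Keeping the list in one constant makes the "files must exist" rule easy to check at build time, for instance by verifying each path against `_site/`.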

activate event:
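A minimal sketch of an activate handler, assuming it deletes every cache other than the current `aleph-v1` (the behavior the cache-invalidation test in Testing expects). The helper names are hypothetical, and the sketch assumes no other code on this origin owns Cache Storage entries that should survive.

```javascript
const CACHE_NAME = 'aleph-v1';

// Pure helper: which existing cache names are stale relative to the current one?
function staleCacheNames(names, current) {
  return names.filter((name) => name !== current);
}

// Deletes every cache on the origin except the current versioned one.
async function deleteStaleCaches(cacheStorage) {
  const names = await cacheStorage.keys();
  await Promise.all(
    staleCacheNames(names, CACHE_NAME).map((name) => cacheStorage.delete(name))
  );
}

// Only register the listener when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('activate', (event) => {
    // Keep the worker alive until old caches are gone.
    event.waitUntil(deleteStaleCaches(self.caches));
  });
}
```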

Fail-closed caching model. The ONLY things ever written to the cache are (a) the precache list at install — which includes exactly one HTML file, the static /offline.html shell — and (b) static assets matched by the cache-first allowlist below. No navigation/content HTML is ever cached at runtime — there is no “cache everything not blocklisted” path. This is an allowlist, not a blocklist, so a current or future same-origin HTML route (sensitive or not) cannot become offline/stale by default. Consequence: full offline HTML browsing is not provided (already a non-goal); the only offline navigation response is the precached /offline.html.

Shared cache-write guard (isCacheable(request, response)) — applied on the cache-first path’s write (since that is the only path that writes at runtime). A response is only stored when ALL of:
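Going by the summary in Data Flow (GET, no Range header, status 200, type 'basic', Cache-Control without no-store/private), the guard could be sketched as follows. The exact header parsing is an assumption; the sketch errs on the side of refusing to store.

```javascript
// Returns true only when every condition holds; any doubt means "do not store".
function isCacheable(request, response) {
  if (request.method !== 'GET') return false;
  if (request.headers.has('range')) return false;   // partial-content requests
  if (!response || response.status !== 200) return false;
  if (response.type !== 'basic') return false;       // same-origin, non-opaque only
  const cacheControl = (response.headers.get('cache-control') || '').toLowerCase();
  // Any mention of no-store or private disqualifies the response.
  if (/\b(no-store|private)\b/.test(cacheControl)) return false;
  return true;
}
```

Because the function only reads `method`, `status`, `type`, and headers, it can be unit-tested with plain stand-in objects, without a browser.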

fetch event — routing logic, evaluated in this order:

  1. Bypass entirely (do NOT call event.respondWith; request goes straight to network and is never read or stored). Bypass when ANY of:
    • The request method is not GET.
    • The request is cross-origin (url.origin !== self.location.origin) — this covers the production login/JWT API on auth.sophiebi.com.
    • The same-origin path starts with one of: /login, /verify-email, /_protected, /now, /journal, /theater, /api, /health. /theater has admin auth/mutation controls (theater/index.html:240,253,483); /api and /health cover the same-origin dispatcher routes in vercel.json:16-17.
  2. Cache-first — for same-origin static assets whose path starts with one of: /css/, /js/, /fonts/, /img/, /assets/, or is one of the root icon / favicon files. On cache hit, return cached response and refresh the cache entry in the background (stale-while-revalidate; the revalidation write also goes through isCacheable). On miss, fetch, write via isCacheable, and return it. This is the only runtime cache-write path.
  3. Network-passthrough with offline fallback — everything else (primarily HTML navigations). No navigation/content HTML is cached:
    • Try the network and return the response as-is (no cache write).
    • On network failure, if the request is a navigation (request.mode === 'navigate'), return the precached /offline.html; otherwise let the failure propagate.
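The routing order above can be sketched as a pure classification function. The name `classify`, its return labels, and the `ROOT_ICON_FILES` entry are hypothetical; the design only says "root icon / favicon files" without naming them.

```javascript
const BYPASS_PREFIXES = ['/login', '/verify-email', '/_protected', '/now',
  '/journal', '/theater', '/api', '/health'];
const STATIC_PREFIXES = ['/css/', '/js/', '/fonts/', '/img/', '/assets/'];
// Hypothetical file name standing in for the unnamed root icon / favicon files.
const ROOT_ICON_FILES = ['/favicon.ico'];

// Returns 'bypass', 'cache-first', or 'passthrough', evaluated in that order.
function classify(request, origin) {
  const url = new URL(request.url);
  if (request.method !== 'GET') return 'bypass';
  if (url.origin !== origin) return 'bypass';
  if (BYPASS_PREFIXES.some((p) => url.pathname.startsWith(p))) return 'bypass';
  if (STATIC_PREFIXES.some((p) => url.pathname.startsWith(p)) ||
      ROOT_ICON_FILES.includes(url.pathname)) {
    return 'cache-first';
  }
  return 'passthrough';
}
```

In the fetch listener, a `'bypass'` result would mean returning without calling `event.respondWith`, with `self.location.origin` passed as `origin`. Plain prefix matching also bypasses unrelated paths such as `/nowhere`; that over-matching errs in the fail-closed direction.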

2. offline.html (new, repo root)

A minimal Jekyll page (front matter present so Jekyll renders it; layout: null or a light layout to avoid pulling heavy dependencies). On-brand “You’re offline” message with a short note that the page will work again once back online. Served only by the SW’s offline fallback for failed navigations.

3. site.webmanifest (edit)

Add to the existing JSON:

Keep existing name, short_name, icons, theme_color, background_color, display.

4. _includes/head.html (edit)

Add <meta name="theme-color" content="#ffffff"> near the existing icon/manifest links (value matches the manifest theme_color). Add a single deferred script tag that loads the registration script (see below).

Layout coverage (verified). head.html is included via _layouts/default.html:3, which is the base layout reached by page, post, home, default_nofooter, etc. So login/index.html, verify-email.html, and theater/index.html (all layout: page) already inherit the manifest link, theme-color meta, and registration tag — they must NOT get them added directly (that would duplicate the tags / inject a nested head). The only genuinely standalone page is now/index.html (its own <!DOCTYPE>/<head>).

now/index.html (standalone, edit). This page currently has no manifest link and no SW registration, so a user landing directly on /now/ before visiting any layout-based page gets no install affordance. To keep installability genuinely site-wide, add to its <head> the same three things head.html provides: the <link rel="manifest" href="/site.webmanifest">, the <meta name="theme-color" content="#ffffff">, and the deferred register-sw.js script tag. This only affects installability/registration — /now stays in the SW bypass list, so the page itself is still never cached. (Registering the worker from a bypassed page registers it for the origin without caching that page.)

Net effect: registration markup lives in exactly two places — head.html (all layout-based pages) and now/index.html (the one standalone page).

5. js/register-sw.js (new) + registration

Small script, loaded with defer from head.html (and from now/index.html, the one standalone page):

if ('serviceWorker' in navigator) {
  window.addEventListener('load', function () {
    navigator.serviceWorker
      .register('/sw.js', { scope: '/', updateViaCache: 'none' })
      .catch(function () { /* no-op */ });
  });
}

Registers after window load to avoid contending with first-paint resources, guarded by feature detection. updateViaCache: 'none' forces the browser to revalidate sw.js against the network on update checks instead of serving it from the HTTP cache — without this, a stale sw.js could defeat the cache-versioning scheme.

6. vercel.json (edit) — sw.js cache headers

Add a headers rule so /sw.js is served Cache-Control: no-cache (revalidate every time). This ensures a freshly-deployed sw.js is picked up promptly on the Vercel-served origin. On GitHub Pages the effective update latency is the platform default (~10-min asset cache) combined with the browser’s built-in 24h SW-script update rule; updateViaCache: 'none' in registration mitigates the HTTP-cache leg on both hosts.

Data Flow

Browser request
      │
      ▼
 sw.js fetch handler
      │
      ├─ non-GET / cross-origin / sensitive path
      │   (/login /verify-email /_protected /now /journal /theater /api /health)?
      │        └─► bypass (network only, never read or stored)
      │
      ├─ static asset path (/css /js /fonts /img /assets, root icon files)?
      │        └─► cache-first (return cache, revalidate in bg via isCacheable)
      │                                          ↑ ONLY runtime cache-write path
      │
      └─ otherwise (HTML nav) ──► network-passthrough (NEVER cached)
                                     ├─ ok ──► return as-is
                                     ├─ fail + navigate ──► /offline.html
                                     └─ fail + other ──► propagate error

isCacheable = GET, no Range, status 200, type 'basic',
              Cache-Control without no-store/private   (gate on the cache-first write)
Caches ever hold only: precache list (incl. /offline.html) + cache-first static
assets. No navigation/content HTML is ever cached at runtime.
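
The two non-bypass branches of this flow could look roughly like the sketch below. The function names and the dependency-injected parameters (`cache`, `fetchFn`, `canStore`, `waitUntil`) are assumptions made so the logic can be exercised outside a worker; `canStore` stands for the isCacheable guard.

```javascript
const OFFLINE_URL = '/offline.html';

// Cache-first with background refresh (stale-while-revalidate).
// This is the only function that writes to the cache at runtime.
async function cacheFirst(request, cache, fetchFn, canStore, waitUntil = () => {}) {
  const cached = await cache.match(request);
  const refresh = fetchFn(request).then(async (response) => {
    if (canStore(request, response)) {
      // Store a clone so the original body can still be returned to the page.
      await cache.put(request, response.clone());
    }
    return response;
  });
  if (cached) {
    // A failed revalidation must not break a cache hit.
    waitUntil(refresh.catch(() => {}));
    return cached;
  }
  return refresh; // miss: the network response (or its error) is the answer
}

// Network passthrough: never writes to the cache.
async function passthrough(request, cache, fetchFn) {
  try {
    return await fetchFn(request);
  } catch (err) {
    if (request.mode === 'navigate') {
      const fallback = await cache.match(OFFLINE_URL);
      if (fallback) return fallback;
    }
    throw err; // non-navigation failures propagate
  }
}
```

Inside the worker, `fetchFn` would be the global `fetch`, `cache` the opened `aleph-v1` cache, and `waitUntil` a wrapper around `event.waitUntil` so the background refresh is not cut short.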

Error Handling

Security Considerations

Testing

  1. Build & serve locally: build _site/, serve over http://localhost (SWs run on localhost). Confirm in DevTools → Application → Service Workers that sw.js installs and activates, and precache populates under aleph-v1.
  2. Bypass verification: load /login and /verify-email; confirm Network tab shows responses are NOT “from ServiceWorker” and nothing is written to the cache for those paths. Repeat for a same-origin /api/... GET and /health (must NOT be served from SW or cached), for the cross-origin auth.sophiebi.com API, and for any POST. Also confirm a response carrying Cache-Control: no-store/private is never written to the cache.
  3. Static cache: load a page, confirm /css/main.css etc. are served “from ServiceWorker” on repeat load.
  4. Offline: toggle Offline in DevTools; confirm static assets still load from cache and that ANY navigation (even a previously-visited HTML page, since HTML is never cached) falls back to /offline.html.
  5. Installability: run Lighthouse PWA audit; confirm installability criteria pass and the install prompt is available.
  6. Cache invalidation: bump version locally, reload twice, confirm old cache is deleted on activate AND that the new sw.js is actually refetched from the network (not served from HTTP cache) — the production-likely silent-failure scenario.
  7. Automated cache-forbidden regression (highest-risk path): a scripted local browser test (e.g. Playwright, which is available in this environment) that registers the SW, visits each forbidden route — /api/..., /health, /login, /verify-email, /now, /journal, /_protected, /theater — plus an arbitrary same-origin HTML page, then asserts via caches.keys() + cache.match() that NONE of those URLs was stored and that no runtime-cached navigation/content HTML exists — the precached /offline.html is the only permitted cached HTML entry. This converts the “must never cache sensitive content” property from a manual DevTools check into a regression guard that runs on every change to sw.js.
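
The assertion in step 7 could be split into an in-page collector and a pure check, sketched below. All function and constant names are hypothetical, and the sketch assumes cached HTML can be recognized by its Content-Type header.

```javascript
const FORBIDDEN_PREFIXES = ['/api', '/health', '/login', '/verify-email',
  '/now', '/journal', '/_protected', '/theater'];
const ALLOWED_HTML = ['/offline.html']; // the only HTML permitted in any cache

// Runs in the page context: lists every entry in every cache on the origin.
async function collectCacheEntries() {
  const entries = [];
  for (const name of await caches.keys()) {
    const cache = await caches.open(name);
    for (const request of await cache.keys()) {
      const response = await cache.match(request);
      entries.push({
        url: request.url,
        contentType: (response && response.headers.get('content-type')) || '',
      });
    }
  }
  return entries;
}

// Pure check: returns the entries that violate the fail-closed rules.
function findViolations(entries) {
  return entries.filter(({ url, contentType }) => {
    const path = new URL(url).pathname;
    const forbidden = FORBIDDEN_PREFIXES.some((p) => path.startsWith(p));
    const isHtml = contentType.toLowerCase().includes('text/html');
    return forbidden || (isHtml && !ALLOWED_HTML.includes(path));
  });
}
```

Under Playwright, the collector would be passed to `page.evaluate(collectCacheEntries)` after visiting the forbidden routes, and the test would assert that `findViolations` returns an empty array.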

Open Questions
