customer-presentation/docs/superpowers/specs/2026-04-27-engagement-tracking-design.md
Arlind Ukshini 4d2ff042e3 docs: drop welcome_cta event from engagement tracking spec
There's only one welcome-step button (#welcome-continue, label "Start
the introduction"); the recent renames were sequential edits to the
same button, not two separate CTAs. The click immediately navigates to
/timeline so the subsequent timeline_view event already captures it.

Also clarified GET / no longer has a timeline branch, and pinned down
how session_id flows into the login event (refactor issueSession to
return its new ID).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:03:53 +02:00

Engagement Tracking — Design

Date: 2026-04-27
Status: Approved (awaiting implementation plan)

Goal

Capture landmark engagement events on the site so we can answer "who's logging in, what device they're on, do they reach the timeline." Server-side, no client instrumentation. One unified events table, one CLI for reading it.

The existing bifrost_joins table (final-CTA clicks) stays as-is: it already has a working CLI and a documented schema. Folding it into events would require a migration and would break bin/joins.js, which is not worth the churn now. It remains a possible follow-up if one-stop reporting is wanted.

Events to track

event_type      Trigger                                    meta (JSON)
login           POST /auth/login returns 200               (none)
timeline_view   Server-side, on the GET /timeline route    {view: "mobile" | "desktop", forced: true | false}

forced is true when the ?view= query param overrode the UA guess.

Failed login attempts (403 not_invited) are out of scope (engagement focus, not security audit). Logout is also out of scope — session lifetime can be inferred from sessions.issued_at if needed. The welcome-step "Start the introduction" button is not tracked separately — it's a single one-button transition that immediately navigates to /timeline, so the click is fully captured by the subsequent timeline_view event. Adding a welcome_cta event would duplicate that signal and require client-side instrumentation for negligible gain.

GET / always serves the entrance shell (public/entrance.html); entrance.js then routes the user client-side to either the email step or the welcome step. There is no server-side dispatch from / to the timeline, so / is not a tracking trigger.

Schema

New table in src/db.js:

CREATE TABLE IF NOT EXISTS events (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  event_type  TEXT    NOT NULL,
  email       TEXT    NOT NULL,
  occurred_at INTEGER NOT NULL,
  session_id  TEXT,
  device_type TEXT,
  os          TEXT,
  browser     TEXT,
  user_agent  TEXT,
  meta        TEXT
);
CREATE INDEX IF NOT EXISTS idx_events_email     ON events(email);
CREATE INDEX IF NOT EXISTS idx_events_type_time ON events(event_type, occurred_at);

Field semantics:

  • event_type — one of login, timeline_view. New types may be added later without schema change.
  • email — lowercased, matches invites.email. Always populated; events without an authenticated user are not recorded.
  • occurred_at — Date.now() at insertion time (ms since epoch, matching the rest of the schema).
  • session_id — the session ID. For timeline_view, sourced from req.session.id (the row attached by requireAuth). For login, sourced from the new ID returned by a refactored issueSession() — see the src/sessions.js modification below.
  • device_type — one of mobile, tablet, desktop. Nullable if the UA is missing.
  • os — one of iOS, Android, Windows, macOS, Linux, other. Nullable.
  • browser — one of Safari, Chrome, Firefox, Edge, other. Nullable. Note: Chrome on iOS identifies itself with CriOS/ rather than Chrome/, so the UA detector reports it as Safari. This is acceptable noise; the raw UA is preserved for re-parsing.
  • user_agent — raw UA string, stored as fallback so a bad regex can be re-parsed later. Same data is already kept on sessions.user_agent, so no new privacy posture.
  • meta — JSON-encoded object with event-specific fields. NULL if none. Always written via JSON.stringify(...) and read via JSON.parse(...).
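The meta convention above can be sketched as a pair of small helpers. This is a minimal illustration, not part of the planned code: encodeMeta and decodeMeta are hypothetical names, and the spec itself only requires JSON.stringify on write and JSON.parse on read.

```javascript
// Hypothetical helpers (names are not from the spec) showing the meta
// column convention: JSON text when there is data, NULL otherwise.

// Serialise an event-specific object for the meta column.
function encodeMeta(meta) {
  return meta ? JSON.stringify(meta) : null;
}

// Parse a meta column value read back from the events table.
function decodeMeta(text) {
  return text === null || text === undefined ? null : JSON.parse(text);
}

// Round trip for a timeline_view event.
const stored = encodeMeta({ view: 'mobile', forced: true });
const restored = decodeMeta(stored); // { view: 'mobile', forced: true }
```

A login event carries no meta, so encodeMeta(null) yields null and the column is stored as NULL.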

No migration needed — CREATE TABLE IF NOT EXISTS is enough on first deploy.

Code layout

New files

src/ua.js — UA parser. ~30 lines, hand-rolled regex, no new dependency.

export function parseUA(ua) {
  // returns { device_type, os, browser } — nullable on missing UA
}

Regex set:

  • device_type: tablet (iPad, Android(?!.*Mobile)) → tablet; existing MOBILE_UA_RE from server.js → mobile; else desktop. Move MOBILE_UA_RE from server.js into src/ua.js and re-export so the existing dispatcher uses the same source of truth.
  • os: iPhone|iPad|iPod → iOS; Android → Android; Windows → Windows; Mac OS X|Macintosh → macOS; Linux → Linux; else other.
  • browser (order matters): Edg/ → Edge; Firefox/ → Firefox; Chrome/ → Chrome; Safari/ → Safari; else other. (Edge before Chrome before Safari; all Chromium UAs include Safari/, all Edge UAs include Chrome/.)
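The regex set above could be assembled roughly as follows. This is a minimal sketch under stated assumptions, not the final src/ua.js: the MOBILE_UA_RE pattern is a hypothetical stand-in (the real one lives in server.js and is not reproduced in this spec), TABLET_UA_RE is an illustrative name, and the export keywords are omitted so the snippet runs standalone.

```javascript
// Hypothetical stand-in: the real MOBILE_UA_RE is defined in server.js.
const MOBILE_UA_RE = /Mobi|iPhone|iPod|Android/i;
// Illustrative name; the pattern is the tablet rule from the regex set.
const TABLET_UA_RE = /iPad|Android(?!.*Mobile)/;

function parseUA(ua) {
  // Missing UA: every parsed field is null, matching the nullable columns.
  if (!ua) return { device_type: null, os: null, browser: null };

  // Tablet is checked first because Android tablets also match the mobile rule.
  let device_type = 'desktop';
  if (TABLET_UA_RE.test(ua)) device_type = 'tablet';
  else if (MOBILE_UA_RE.test(ua)) device_type = 'mobile';

  // iOS before macOS (iOS UAs contain "like Mac OS X");
  // Android before Linux (Android UAs contain "Linux").
  let os = 'other';
  if (/iPhone|iPad|iPod/.test(ua)) os = 'iOS';
  else if (/Android/.test(ua)) os = 'Android';
  else if (/Windows/.test(ua)) os = 'Windows';
  else if (/Mac OS X|Macintosh/.test(ua)) os = 'macOS';
  else if (/Linux/.test(ua)) os = 'Linux';

  // Edge before Chrome before Safari, as the spec requires.
  let browser = 'other';
  if (/Edg\//.test(ua)) browser = 'Edge';
  else if (/Firefox\//.test(ua)) browser = 'Firefox';
  else if (/Chrome\//.test(ua)) browser = 'Chrome';
  else if (/Safari\//.test(ua)) browser = 'Safari';

  return { device_type, os, browser };
}

const sample = parseUA(
  'Mozilla/5.0 (iPhone; CPU iPhone OS 17_4 like Mac OS X) AppleWebKit/605.1.15 ' +
  '(KHTML, like Gecko) Version/17.4 Mobile/15E148 Safari/604.1'
);
// sample is { device_type: 'mobile', os: 'iOS', browser: 'Safari' }
```

The ordering of the checks carries most of the logic here; swapping any pair in the os or browser chains would misclassify common UAs.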

src/events.js — thin recorder.

import { q } from './db.js';
import { parseUA } from './ua.js';

export function recordEvent(req, { type, email, sessionId, meta = null }) {
  const ua = req.headers['user-agent'] || '';
  const { device_type, os, browser } = parseUA(ua);
  q.recordEvent.run(
    type,
    email,
    Date.now(),
    sessionId || null,
    device_type,
    os,
    browser,
    ua || null,
    meta ? JSON.stringify(meta) : null
  );
}

Caller passes sessionId explicitly (rather than the recorder reading req.cookies) because the login event happens before req.cookies reflects the freshly-issued session.

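Fire-and-forget: the call is synchronous (better-sqlite3 is sync) and fast enough that it needs no deferral or queueing, and the spec adds no try/catch wrapper. Without one, a failed insert would propagate to the calling route handler. If a future event becomes hot-path, revisit.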

bin/events.js — CLI mirroring bin/joins.js. Subcommands:

  • node bin/events.js list [--type <event>] [--limit <n>] — every event, newest first
  • node bin/events.js summary — per-user counts by event type (one row per user, columns: email, logins, timeline_views, last_seen)
  • node bin/events.js for <email> — full event history for one user
  • node bin/events.js stats — totals per event type + unique users per event type + device-type breakdown
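Argument handling for these subcommands might look like the sketch below. It is an illustration only: parseArgs is a hypothetical helper, the default command and the default limit of 50 are assumptions, and bin/joins.js may already use a different pattern that bin/events.js should mirror instead.

```javascript
// Hypothetical parser for `list [--type <event>] [--limit <n>]`,
// `summary`, `for <email>` and `stats`.
// `argv` is expected to be process.argv.slice(2).
function parseArgs(argv) {
  const [command = 'list', ...rest] = argv; // assumed default command
  const opts = { command, type: null, limit: 50, email: null }; // assumed default limit

  for (let i = 0; i < rest.length; i++) {
    const arg = rest[i];
    if (arg === '--type') {
      opts.type = rest[++i] ?? null;
    } else if (arg === '--limit') {
      const n = Number.parseInt(rest[++i], 10);
      if (Number.isInteger(n) && n > 0) opts.limit = n; // ignore invalid limits
    } else if (command === 'for' && opts.email === null) {
      // Emails are stored lowercased, so normalise the lookup key too.
      opts.email = arg.toLowerCase();
    }
  }
  return opts;
}

const listOpts = parseArgs(['list', '--type', 'login', '--limit', '5']);
// listOpts is { command: 'list', type: 'login', limit: 5, email: null }
```

Lowercasing the `for <email>` argument keeps lookups consistent with the lowercased email column.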

Modified files

src/db.js

  • Add the events table + indexes to the db.exec block.
  • Add prepared statements: recordEvent, listEvents, listEventsByType, listEventsForEmail, summariseEvents, countEventsByType, deviceBreakdown.
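One possible shape for the SQL behind those statements is sketched below, assuming src/db.js passes each string to better-sqlite3's db.prepare(). The EVENT_SQL object is a hypothetical container, and the ordering and column aliases are assumptions chosen to match the CLI descriptions, not settled query text.

```javascript
// SQL text only; in src/db.js each string would be handed to db.prepare(...).
// EVENT_SQL is a hypothetical name used for this sketch.
const EVENT_SQL = {
  // Placeholder order matches the positional arguments in recordEvent().
  recordEvent: `
    INSERT INTO events
      (event_type, email, occurred_at, session_id,
       device_type, os, browser, user_agent, meta)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)`,

  // Newest first; id breaks ties between events in the same millisecond.
  listEvents: `
    SELECT * FROM events ORDER BY occurred_at DESC, id DESC LIMIT ?`,
  listEventsByType: `
    SELECT * FROM events WHERE event_type = ?
    ORDER BY occurred_at DESC, id DESC LIMIT ?`,
  listEventsForEmail: `
    SELECT * FROM events WHERE email = ? ORDER BY occurred_at ASC, id ASC`,

  // One row per user: email, logins, timeline_views, last_seen.
  summariseEvents: `
    SELECT email,
           SUM(event_type = 'login')         AS logins,
           SUM(event_type = 'timeline_view') AS timeline_views,
           MAX(occurred_at)                  AS last_seen
    FROM events GROUP BY email ORDER BY last_seen DESC`,

  // Totals and unique users per event type.
  countEventsByType: `
    SELECT event_type, COUNT(*) AS total, COUNT(DISTINCT email) AS unique_users
    FROM events GROUP BY event_type`,

  deviceBreakdown: `
    SELECT device_type, COUNT(*) AS total FROM events GROUP BY device_type`,
};
```

The summary query relies on SQLite evaluating a comparison to 0 or 1, so SUM over it counts matching rows.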

src/sessions.js

  • Refactor issueSession(req, res, email) to return the new session ID (currently returns nothing). The function already generates the ID internally — just return id; at the bottom. No callers break: existing call sites can ignore the return value.

src/auth.js

  • Capture the returned ID: const sessionId = issueSession(req, res, email);
  • After it succeeds, call recordEvent(req, { type: 'login', email, sessionId }).
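The ordering described above can be sketched with injected stand-ins, so it runs without Express or a database. completeLogin is a hypothetical wrapper, not a function in src/auth.js, and the stubs in the usage lines only imitate the real issueSession() and recordEvent().

```javascript
// Hypothetical wrapper showing the call order inside the login handler.
// `issueSession` and `recordEvent` are passed in so stubs can replace them.
function completeLogin(req, res, email, { issueSession, recordEvent }) {
  // The refactored issueSession() returns the freshly generated session ID.
  const sessionId = issueSession(req, res, email);
  // The ID is handed over explicitly because req.cookies does not yet
  // reflect the new session at this point in the request.
  recordEvent(req, { type: 'login', email, sessionId });
  return sessionId;
}

// Usage with stubs standing in for the real modules.
const recorded = [];
const newId = completeLogin({ headers: {} }, {}, 'ada@example.com', {
  issueSession: () => 'sess-123',
  recordEvent: (req, event) => recorded.push(event),
});
// newId is 'sess-123'; recorded[0].sessionId is 'sess-123'
```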

server.js

  • Move MOBILE_UA_RE from server.js into src/ua.js and import it back. The wantsMobileView() helper continues to use the same regex.
  • On app.get('/timeline', requireAuth, ...): after the dispatch decision, call recordEvent(req, { type: 'timeline_view', email: req.session.email, sessionId: req.session.id, meta: { view, forced } }). The session row is already attached to req.session by requireAuth, so no extra DB lookup is needed. Capture both view ('mobile' or 'desktop') and forced (true if ?view= overrode the UA, otherwise false) while they are still distinct, i.e. before wantsMobileView() collapses the query override and the UA guess into a single result.
  • No new endpoints. No /api/track/... route.
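Deriving view and forced for the timeline_view meta could be sketched as below. Several parts are assumptions: resolveView is a hypothetical helper, the MOBILE_UA_RE pattern is a stand-in for the one in server.js, only ?view=mobile and ?view=desktop are treated as overrides, and any valid override sets forced to true even when it agrees with the UA guess.

```javascript
// Hypothetical stand-in for the MOBILE_UA_RE that currently lives in server.js.
const MOBILE_UA_RE = /Mobi|iPhone|iPod|Android/i;

// Returns both values the timeline_view meta needs, kept separate.
function resolveView(req) {
  const requested = req.query && req.query.view;
  // Assumption: a recognised ?view= value always counts as forced.
  if (requested === 'mobile' || requested === 'desktop') {
    return { view: requested, forced: true };
  }
  // No usable override: fall back to the UA guess.
  const ua = (req.headers && req.headers['user-agent']) || '';
  return { view: MOBILE_UA_RE.test(ua) ? 'mobile' : 'desktop', forced: false };
}

const overridden = resolveView({
  query: { view: 'mobile' },
  headers: { 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)' },
});
// overridden is { view: 'mobile', forced: true }
```

The returned object can be passed straight through as meta: { view, forced } in the recordEvent call.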

No frontend changes. public/entrance.js, protected/, and the timeline assets are untouched.

Docs

CLAUDE.md — under "Common commands":

node bin/events.js list              # read engagement event log
                                     # (also: summary, for <email>, stats)

OPERATIONS.md — short section "Engagement events" mirroring the existing "Bifrost joins" section: what's tracked, how to read, where the data lives.

CHECKLIST.md — add a row to the relevant section: "log in fresh; view timeline on desktop; force ?view=mobile; run node bin/events.js list and confirm rows with correct device + meta fields (login, timeline_view × 2 with forced: true on the second)."

Operational notes

  • No new env vars.
  • No CSP changes — no new client-side fetches.
  • No security-invariant changes:
    • events table is write-only from app code, read-only from CLI.
    • No new public endpoints.
    • Email enumeration posture unchanged.
  • Storage: ~150 bytes/row × low event volume (invite-list-only site) → trivial. No retention/pruning policy needed for the foreseeable future.
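As a rough illustration of the storage estimate, the arithmetic can be written out as below. The user and event counts are made-up inputs for the example, not measurements or projections from this project; only the ~150 bytes/row figure comes from the note above.

```javascript
// Rough per-row size quoted in the storage note.
const BYTES_PER_ROW = 150;

// Total bytes for a given number of users and events per user.
function estimateStorageBytes(users, eventsPerUser, bytesPerRow = BYTES_PER_ROW) {
  return users * eventsPerUser * bytesPerRow;
}

// Illustrative inputs only: 200 invitees with 20 events each.
const estimate = estimateStorageBytes(200, 20); // 600000 bytes, about 0.6 MB
```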

Out of scope (explicitly)

  • Failed login tracking
  • Logout tracking
  • Scroll-depth / dwell-time on the timeline
  • Per-section timeline views
  • Folding bifrost_joins into the events table
  • Web UI for viewing events (CLI only, matching joins.js)
  • Bot/crawler filtering — the site is invite-list-only and fully gated, so no bots reach gated routes

Decisions / non-obvious choices

  • Single events table over per-event-type tables — device fields are identical across all event types; splitting would duplicate the schema.
  • bifrost_joins left alone — has a working CLI and existing data; migration cost not justified now.
  • Hand-rolled UA parser over ua-parser-js — project keeps a small dependency footprint; the parsed fields we need are coarse-grained.
  • Raw user_agent stored — same field already lives on sessions, so no new privacy footprint, and lets us re-parse later if a regex misclassifies.
  • issueSession refactored to return the new ID — lets the login event populate session_id properly without reading-back-the-cookie hacks. Tiny change, no callers break.
  • No welcome_cta event — the welcome step has a single one-button transition that immediately navigates to /timeline. The click is fully captured by the subsequent timeline_view event; a separate welcome_cta event would duplicate that signal.