customer-presentation/docs/superpowers/specs/2026-04-27-engagement-tracking-design.md
Arlind Ukshini 4d2ff042e3 docs: drop welcome_cta event from engagement tracking spec
There's only one welcome-step button (#welcome-continue, label "Start
the introduction"); the recent renames were sequential edits to the
same button, not two separate CTAs. The click immediately navigates to
/timeline so the subsequent timeline_view event already captures it.

Also clarified GET / no longer has a timeline branch, and pinned down
how session_id flows into the login event (refactor issueSession to
return its new ID).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 10:03:53 +02:00

# Engagement Tracking — Design
**Date:** 2026-04-27
**Status:** Approved (awaiting implementation plan)
## Goal
Capture landmark engagement events on the site so we can answer "who's logging in, what device they're on, do they reach the timeline." Server-side, no client instrumentation. One unified `events` table, one CLI for reading it.
The existing `bifrost_joins` table (final-CTA clicks) stays as-is — already has a working CLI and documented schema. Folding it into `events` would require a migration and breaking `bin/joins.js`; not worth the churn now. Possible follow-up later if one-stop reporting is wanted.
## Events to track
| `event_type` | Trigger | `meta` (JSON) |
|---|---|---|
| `login` | `POST /auth/login` returns 200 | — |
| `timeline_view` | Server-side, on the `GET /timeline` route | `{view: "mobile" \| "desktop", forced: true \| false}`; `forced` is true when `?view=` query param overrode the UA guess |
Failed login attempts (`403 not_invited`) are out of scope (engagement focus, not security audit). Logout is also out of scope — session lifetime can be inferred from `sessions.issued_at` if needed. The welcome-step "Start the introduction" button is **not** tracked separately — it's a single one-button transition that immediately navigates to `/timeline`, so the click is fully captured by the subsequent `timeline_view` event. Adding a `welcome_cta` event would duplicate that signal and require client-side instrumentation for negligible gain.
`GET /` always serves the entrance shell (`public/entrance.html`); `entrance.js` then routes the user client-side to either the email step or the welcome step. There is no server-side dispatch from `/` to the timeline, so `/` is not a tracking trigger.
## Schema
New table in `src/db.js`:
```sql
CREATE TABLE IF NOT EXISTS events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  event_type TEXT NOT NULL,
  email TEXT NOT NULL,
  occurred_at INTEGER NOT NULL,
  session_id TEXT,
  device_type TEXT,
  os TEXT,
  browser TEXT,
  user_agent TEXT,
  meta TEXT
);
CREATE INDEX IF NOT EXISTS idx_events_email ON events(email);
CREATE INDEX IF NOT EXISTS idx_events_type_time ON events(event_type, occurred_at);
```
Field semantics:
- `event_type` — one of `login`, `timeline_view`. New types may be added later without schema change.
- `email` — lowercased, matches `invites.email`. Always populated; events without an authenticated user are not recorded.
- `occurred_at` — `Date.now()` at insertion time (ms since epoch, matching the rest of the schema).
- `session_id` — the session ID. For `timeline_view`, sourced from `req.session.id` (the row attached by `requireAuth`). For `login`, sourced from the new ID returned by a refactored `issueSession()` — see the `src/sessions.js` modification below.
- `device_type` — one of `mobile`, `tablet`, `desktop`. Nullable if the UA is missing.
- `os` — one of `iOS`, `Android`, `Windows`, `macOS`, `Linux`, `other`. Nullable.
- `browser` — one of `Safari`, `Chrome`, `Firefox`, `Edge`, `other`. Nullable. Note: Chrome on iOS reports as Safari to the UA detector — acceptable noise; raw UA is preserved for re-parsing.
- `user_agent` — raw UA string, stored as fallback so a bad regex can be re-parsed later. Same data is already kept on `sessions.user_agent`, so no new privacy posture.
- `meta` — JSON-encoded object with event-specific fields. `NULL` if none. Always written via `JSON.stringify(...)` and read via `JSON.parse(...)`.
No migration needed — `CREATE TABLE IF NOT EXISTS` is enough on first deploy.
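The `meta` convention described above (JSON text, or `NULL` when there is nothing to store) can be sketched as a pair of helpers. The names `encodeMeta` and `decodeMeta` are hypothetical and not part of this spec; a CLI reader would need something like the decode half before printing `view` or `forced`.

```js
// Hypothetical helpers; the names encodeMeta/decodeMeta are not from the spec.
function encodeMeta(meta) {
  // NULL column when there is nothing event-specific to store.
  return meta ? JSON.stringify(meta) : null;
}

function decodeMeta(column) {
  // Rows such as `login` carry no meta, so NULL maps back to null.
  return column === null || column === undefined ? null : JSON.parse(column);
}
```

Keeping both directions in one place would make it harder for a reader and a writer to drift apart on the `NULL` case.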
## Code layout
### New files
**`src/ua.js`** — UA parser. ~30 lines, hand-rolled regex, no new dependency.
```js
export function parseUA(ua) {
  // returns { device_type, os, browser }; each field is null when the UA is missing
}
```
Regex set:
- `device_type`: tablet (`iPad`, `Android(?!.*Mobile)`) → `tablet`; existing `MOBILE_UA_RE` from `server.js` → `mobile`; else `desktop`. Move `MOBILE_UA_RE` from `server.js` into `src/ua.js` and re-export so the existing dispatcher uses the same source of truth.
- `os`: `iPhone|iPad|iPod` → `iOS`; `Android` → `Android`; `Windows` → `Windows`; `Mac OS X|Macintosh` → `macOS`; `Linux` → `Linux`; else `other`. (Checked in this order: iOS UAs contain `Mac OS X` and Android UAs contain `Linux`.)
- `browser`: order matters — `Edg/` → `Edge`; `Firefox/` → `Firefox`; `Chrome/` → `Chrome`; `Safari/` → `Safari`; else `other`. (Edge before Chrome before Safari; all Chromium UAs include `Safari/`, all Edge UAs include `Chrome/`.)
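As a rough illustration of the rules above, a parser along these lines would fit the ~30-line budget. `MOBILE_UA_RE` here is a stand-in pattern, since the real one lives in `server.js` and is not reproduced in this spec; `TABLET_UA_RE` is likewise a name assumed for the sketch. In `src/ua.js` the function would be exported.

```js
// Sketch only: MOBILE_UA_RE is a stand-in, not the project's actual regex.
const MOBILE_UA_RE = /Mobi|iPhone|iPod|Android/i;
const TABLET_UA_RE = /iPad|Android(?!.*Mobile)/; // assumed name

function parseUA(ua) {
  // Missing or empty UA: every parsed field is null.
  if (!ua) return { device_type: null, os: null, browser: null };

  // Tablet is tested first because Android tablets also match the mobile pattern.
  let device_type = 'desktop';
  if (TABLET_UA_RE.test(ua)) device_type = 'tablet';
  else if (MOBILE_UA_RE.test(ua)) device_type = 'mobile';

  // iOS before macOS and Android before Linux: the later tokens appear in both.
  let os = 'other';
  if (/iPhone|iPad|iPod/.test(ua)) os = 'iOS';
  else if (/Android/.test(ua)) os = 'Android';
  else if (/Windows/.test(ua)) os = 'Windows';
  else if (/Mac OS X|Macintosh/.test(ua)) os = 'macOS';
  else if (/Linux/.test(ua)) os = 'Linux';

  // Edge before Chrome before Safari, as the list above requires.
  let browser = 'other';
  if (/Edg\//.test(ua)) browser = 'Edge';
  else if (/Firefox\//.test(ua)) browser = 'Firefox';
  else if (/Chrome\//.test(ua)) browser = 'Chrome';
  else if (/Safari\//.test(ua)) browser = 'Safari';

  return { device_type, os, browser };
}
```

Each field is resolved by a first-match-wins chain, so the ordering constraints stay visible in the code instead of being encoded in one large regex.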
**`src/events.js`** — thin recorder.
```js
import { q } from './db.js';
import { parseUA } from './ua.js';

// Insert one row into `events` for the authenticated user behind `req`.
export function recordEvent(req, { type, email, sessionId, meta = null }) {
  const ua = req.headers['user-agent'] || '';
  const { device_type, os, browser } = parseUA(ua);
  q.recordEvent.run(
    type,
    email,
    Date.now(), // occurred_at, ms since epoch
    sessionId || null,
    device_type,
    os,
    browser,
    ua || null, // raw UA kept so it can be re-parsed later
    meta ? JSON.stringify(meta) : null
  );
}
```
Caller passes `sessionId` explicitly (rather than the recorder reading `req.cookies`) because the `login` event happens before `req.cookies` reflects the freshly-issued session.
The insert runs inline and its result is ignored: better-sqlite3 is synchronous, and a single-row insert is fast enough that no try/catch wrapper is planned. If a future event becomes hot-path, revisit.
**`bin/events.js`** — CLI mirroring `bin/joins.js`. Subcommands:
- `node bin/events.js list [--type <event>] [--limit <n>]` — every event, newest first
- `node bin/events.js summary` — per-user counts by event type (one row per user, columns: email, logins, timeline_views, last_seen)
- `node bin/events.js for <email>` — full event history for one user
- `node bin/events.js stats` — totals per event type + unique users per event type + device-type breakdown
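The subcommand surface above could be parsed with a small function like the following. This is a sketch under assumptions: `parseCli` is a hypothetical name, `bin/joins.js` may structure its argument handling differently, and lowercasing the `for` argument is assumed here only because `events.email` is stored lowercased.

```js
// Hypothetical parser for the four subcommands; not taken from bin/joins.js.
function parseCli(argv) {
  const [command, ...rest] = argv;
  if (command === 'list') {
    const opts = { command, type: null, limit: null };
    // Options come in "--flag value" pairs.
    for (let i = 0; i < rest.length; i += 2) {
      const value = rest[i + 1];
      if (rest[i] === '--type' && value) opts.type = value;
      else if (rest[i] === '--limit' && /^\d+$/.test(value || '')) opts.limit = Number(value);
      else throw new Error(`bad option: ${rest[i]}`);
    }
    return opts;
  }
  if (command === 'for') {
    if (!rest[0]) throw new Error('usage: for <email>');
    // Assumption: match the lowercased form stored in events.email.
    return { command, email: rest[0].toLowerCase() };
  }
  if (command === 'summary' || command === 'stats') return { command };
  throw new Error(`unknown command: ${command}`);
}
```

In the real script this would presumably be called as `parseCli(process.argv.slice(2))`, with the result selecting one of the prepared statements.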
### Modified files
**`src/db.js`**
- Add the `events` table + indexes to the `db.exec` block.
- Add prepared statements: `recordEvent`, `listEvents`, `listEventsByType`, `listEventsForEmail`, `summariseEvents`, `countEventsByType`, `deviceBreakdown`.
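To make two of the statements above concrete, here is one possible shape for their SQL text. The queries are assumptions derived from the schema in this document, not the project's actual statements, and the constant names are hypothetical. The strings are kept separate from the database handle so the sketch stays self-contained.

```js
// Sketch: SQL text only. Column order mirrors the arguments recordEvent() passes.
const EVENT_COLUMNS = [
  'event_type', 'email', 'occurred_at', 'session_id',
  'device_type', 'os', 'browser', 'user_agent', 'meta',
];

// One positional placeholder per column.
const RECORD_EVENT_SQL =
  `INSERT INTO events (${EVENT_COLUMNS.join(', ')}) ` +
  `VALUES (${EVENT_COLUMNS.map(() => '?').join(', ')})`;

// One row per user, as the `summary` subcommand describes.
// In SQLite a comparison evaluates to 0 or 1, so SUM() counts matching rows.
const SUMMARISE_EVENTS_SQL = `
  SELECT email,
         SUM(event_type = 'login')         AS logins,
         SUM(event_type = 'timeline_view') AS timeline_views,
         MAX(occurred_at)                  AS last_seen
  FROM events
  GROUP BY email
  ORDER BY last_seen DESC`;
```

With better-sqlite3 these would be wrapped once at startup, for example `db.prepare(RECORD_EVENT_SQL)`, and exposed on `q` next to the existing statements.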
**`src/sessions.js`**
- Refactor `issueSession(req, res, email)` to **return the new session ID** (currently returns nothing). The function already generates the ID internally — just `return id;` at the bottom. No callers break: existing call sites can ignore the return value.
**`src/auth.js`**
- Capture the returned ID: `const sessionId = issueSession(req, res, email);`
- After it succeeds, call `recordEvent(req, { type: 'login', email, sessionId })`.
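The ordering of the two steps above can be sketched as follows. The helpers are injected here only so the snippet runs on its own; in `src/auth.js` they would be imported, and `makeLoginSuccess` is a hypothetical name.

```js
// Hypothetical wiring; in src/auth.js both helpers would be imported, not injected.
function makeLoginSuccess({ issueSession, recordEvent }) {
  return function onLoginSuccess(req, res, email) {
    // After the refactor, issueSession() returns the ID it generated.
    const sessionId = issueSession(req, res, email);
    // Record only once the session exists, so session_id is never a guess.
    recordEvent(req, { type: 'login', email, sessionId });
    return sessionId;
  };
}
```

If `issueSession()` throws, `recordEvent()` is never reached, which matches the intent that only successful logins are logged.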
**`server.js`**
- Move `MOBILE_UA_RE` from `server.js` into `src/ua.js` and import it back. The `wantsMobileView()` helper continues to use the same regex.
- On `app.get('/timeline', requireAuth, ...)`: after the dispatch decision, call `recordEvent(req, { type: 'timeline_view', email: req.session.email, sessionId: req.session.id, meta: { view, forced } })`. The session row is already attached to `req.session` by `requireAuth`, so no extra DB lookup is needed. Capture both `view` (`'mobile'` or `'desktop'`) and `forced` (`true` if `?view=` overrode the UA, otherwise `false`) before `wantsMobileView()` collapses them into a single result.
- No new endpoints. No `/api/track/...` route.
**No frontend changes.** `public/entrance.js`, `protected/`, and the timeline assets are untouched.
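One way to keep `view` and `forced` separate for the `/timeline` handler described above is a small resolver like this. It is a sketch under assumptions: `resolveView` is a hypothetical name, `MOBILE_UA_RE` is a stand-in for the project's regex, the accepted `?view=` values are assumed to be exactly `mobile` and `desktop`, and any valid `?view=` is treated as forced even when it agrees with the UA guess.

```js
// Sketch: MOBILE_UA_RE is a stand-in, not the project's actual regex.
const MOBILE_UA_RE = /Mobi|iPhone|iPod|Android/i;

function resolveView(req) {
  const requested = req.query.view;
  if (requested === 'mobile' || requested === 'desktop') {
    // An explicit, valid ?view= wins over the UA guess.
    return { view: requested, forced: true };
  }
  // No usable override: fall back to the UA guess.
  const ua = req.headers['user-agent'] || '';
  return { view: MOBILE_UA_RE.test(ua) ? 'mobile' : 'desktop', forced: false };
}
```

The handler could then pass the result straight through as `meta: resolveView(req)` and use `view` for the dispatch decision.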
### Docs
**`CLAUDE.md`** — under "Common commands":
```
node bin/events.js list # read engagement event log
# (also: summary, for <email>, stats)
```
**`OPERATIONS.md`** — short section "Engagement events" mirroring the existing "Bifrost joins" section: what's tracked, how to read, where the data lives.
**`CHECKLIST.md`** — add a row to the relevant section: "log in fresh; view timeline on desktop; force `?view=mobile`; run `node bin/events.js list` and confirm rows with correct device + meta fields (login, timeline_view × 2 with `forced: true` on the second)."
## Operational notes
- No new env vars.
- No CSP changes — no new client-side fetches.
- No security-invariant changes:
  - `events` table is write-only from app code, read-only from CLI.
  - No new public endpoints.
  - Email enumeration posture unchanged.
- Storage: ~150 bytes/row × low event volume (invite-list-only site) → trivial. No retention/pruning policy needed for the foreseeable future.
## Out of scope (explicitly)
- Failed login tracking
- Logout tracking
- Scroll-depth / dwell-time on the timeline
- Per-section timeline views
- Folding `bifrost_joins` into the `events` table
- Web UI for viewing events (CLI only, matching `joins.js`)
- Bot/crawler filtering — the site is invite-list-only and fully gated, so no bots reach gated routes
## Decisions / non-obvious choices
- **Single `events` table** over per-event-type tables — device fields are identical across all event types; splitting would duplicate the schema.
- **`bifrost_joins` left alone** — has a working CLI and existing data; migration cost not justified now.
- **Hand-rolled UA parser** over `ua-parser-js` — project keeps a small dependency footprint; the parsed fields we need are coarse-grained.
- **Raw `user_agent` stored** — same field already lives on `sessions`, so no new privacy footprint, and lets us re-parse later if a regex misclassifies.
- **`issueSession` refactored to return the new ID** — lets the `login` event populate `session_id` properly without reading-back-the-cookie hacks. Tiny change, no callers break.
- **No `welcome_cta` event** — the welcome step has a single one-button transition that immediately navigates to `/timeline`. The click is fully captured by the subsequent `timeline_view` event; a separate `welcome_cta` event would duplicate that signal.