# Engagement Tracking — Design

**Date:** 2026-04-27
**Status:** Approved (awaiting implementation plan)

## Goal

Capture landmark engagement events on the site so we can answer three questions: who is logging in, what device they are on, and whether they reach the timeline. Server-side only, no client instrumentation. One unified `events` table, one CLI for reading it.

The existing `bifrost_joins` table (final-CTA clicks) stays as-is: it already has a working CLI and a documented schema. Folding it into `events` would require a migration and would break `bin/joins.js`; not worth the churn now. Possible follow-up later if one-stop reporting is wanted.

## Events to track

| `event_type` | Trigger | `meta` (JSON) |
|---|---|---|
| `login` | `POST /auth/login` returns 200 | — |
| `timeline_view` | Server-side, on the `GET /timeline` route | `{view: "mobile" \| "desktop", forced: true \| false}` — `forced` is true when the `?view=` query param overrode the UA guess |

Failed login attempts (`403 not_invited`) are out of scope (engagement focus, not security audit). Logout is also out of scope; session lifetime can be inferred from `sessions.issued_at` if needed. The welcome-step "Start the introduction" button is **not** tracked separately: it is a one-button transition that immediately navigates to `/timeline`, so the click is fully captured by the subsequent `timeline_view` event. Adding a `welcome_cta` event would duplicate that signal and require client-side instrumentation for negligible gain.

`GET /` always serves the entrance shell (`public/entrance.html`); `entrance.js` then routes the user client-side to either the email step or the welcome step. There is no server-side dispatch from `/` to the timeline, so `/` is not a tracking trigger.

## Schema

New table in `src/db.js`:

```sql
CREATE TABLE IF NOT EXISTS events (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  event_type  TEXT NOT NULL,
  email       TEXT NOT NULL,
  occurred_at INTEGER NOT NULL,
  session_id  TEXT,
  device_type TEXT,
  os          TEXT,
  browser     TEXT,
  user_agent  TEXT,
  meta        TEXT
);
CREATE INDEX IF NOT EXISTS idx_events_email ON events(email);
CREATE INDEX IF NOT EXISTS idx_events_type_time ON events(event_type, occurred_at);
```

Field semantics:

- `event_type` — one of `login`, `timeline_view`. New types may be added later without schema change.
- `email` — lowercased, matches `invites.email`. Always populated; events without an authenticated user are not recorded.
- `occurred_at` — `Date.now()` at insertion time (ms since epoch, matching the rest of the schema).
- `session_id` — the session ID. For `timeline_view`, sourced from `req.session.id` (the row attached by `requireAuth`). For `login`, sourced from the new ID returned by a refactored `issueSession()` (see the `src/sessions.js` modification under "Modified files").
- `device_type` — one of `mobile`, `tablet`, `desktop`. Nullable if the UA is missing.
- `os` — one of `iOS`, `Android`, `Windows`, `macOS`, `Linux`, `other`. Nullable.
- `browser` — one of `Safari`, `Chrome`, `Firefox`, `Edge`, `other`. Nullable. Note: Chrome on iOS reports as Safari to the UA detector. This is acceptable noise; the raw UA is preserved for re-parsing.
- `user_agent` — raw UA string, stored as a fallback so rows misclassified by a bad regex can be re-parsed later. The same data is already kept on `sessions.user_agent`, so no new privacy posture.
- `meta` — JSON-encoded object with event-specific fields. `NULL` if none. Always written via `JSON.stringify(...)` and read via `JSON.parse(...)`.

No migration needed — `CREATE TABLE IF NOT EXISTS` is enough on first deploy.

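The `meta` convention in the field list (stringify on write, parse on read, `NULL` when there is nothing to store) could be wrapped in two small helpers. This is a minimal sketch, not part of the approved design: `encodeMeta` and `decodeMeta` are hypothetical names, and the design as written inlines the `JSON.stringify` call in the recorder.

```js
// Hypothetical helpers mirroring the `meta` column convention.
// Write side: objects become JSON text, "no meta" becomes NULL.
function encodeMeta(meta) {
  return meta ? JSON.stringify(meta) : null;
}

// Read side: NULL stays null, anything else is parsed back into an object.
function decodeMeta(column) {
  return column === null || column === undefined ? null : JSON.parse(column);
}

// Round trip for a timeline_view event and for an event without meta.
const stored = encodeMeta({ view: 'mobile', forced: true });
const roundTripped = decodeMeta(stored);
const empty = decodeMeta(encodeMeta(null));
```

Keeping the null check on the read side matters because `JSON.parse(null)` returns `null` only by way of string coercion, which is easy to misread in CLI code.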
## Code layout

### New files

**`src/ua.js`** — UA parser. ~30 lines, hand-rolled regex, no new dependency.

```js
export function parseUA(ua) {
  // returns { device_type, os, browser } — nullable on missing UA
}
```

Regex set:
- `device_type`: tablet (`iPad`, `Android(?!.*Mobile)`) → `tablet`; existing `MOBILE_UA_RE` from `server.js` → `mobile`; else `desktop`. Move `MOBILE_UA_RE` from `server.js` into `src/ua.js` and re-export so the existing dispatcher uses the same source of truth.
- `os`: `iPhone|iPad|iPod` → `iOS`; `Android` → `Android`; `Windows` → `Windows`; `Mac OS X|Macintosh` → `macOS`; `Linux` → `Linux`; else `other`.
- `browser`: order matters — `Edg/` → `Edge`; `Firefox/` → `Firefox`; `Chrome/` → `Chrome`; `Safari/` → `Safari`; else `other`. (Edge before Chrome before Safari; all Chromium UAs include `Safari/`, all Edge UAs include `Chrome/`.)

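The regex set above can be sketched as one function. This is an illustration under stated assumptions rather than the final `src/ua.js`: the real `MOBILE_UA_RE` lives in `server.js` and is not shown in this document, so the pattern below is a placeholder, and `TABLET_UA_RE` is a hypothetical name. The `export` keyword is omitted so the snippet runs standalone.

```js
// Placeholder only: the project's actual MOBILE_UA_RE is defined in server.js.
const MOBILE_UA_RE = /Mobi|iPhone|iPod|Android/i;

// Tablets are checked first: iPads, and Android UAs without the "Mobile" token.
const TABLET_UA_RE = /iPad|Android(?!.*Mobile)/;

function parseUA(ua) {
  // Missing or empty UA: every parsed field is null.
  if (!ua) return { device_type: null, os: null, browser: null };

  let device_type = 'desktop';
  if (TABLET_UA_RE.test(ua)) device_type = 'tablet';
  else if (MOBILE_UA_RE.test(ua)) device_type = 'mobile';

  // iOS before macOS (iPhone UAs contain "Mac OS X"); Android before Linux.
  let os = 'other';
  if (/iPhone|iPad|iPod/.test(ua)) os = 'iOS';
  else if (/Android/.test(ua)) os = 'Android';
  else if (/Windows/.test(ua)) os = 'Windows';
  else if (/Mac OS X|Macintosh/.test(ua)) os = 'macOS';
  else if (/Linux/.test(ua)) os = 'Linux';

  // Edge UAs contain "Chrome/", and Chromium UAs contain "Safari/".
  let browser = 'other';
  if (/Edg\//.test(ua)) browser = 'Edge';
  else if (/Firefox\//.test(ua)) browser = 'Firefox';
  else if (/Chrome\//.test(ua)) browser = 'Chrome';
  else if (/Safari\//.test(ua)) browser = 'Safari';

  return { device_type, os, browser };
}

// Sample UA strings for a quick check.
const IPHONE_UA = 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1';
const EDGE_UA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0';
const ANDROID_TABLET_UA = 'Mozilla/5.0 (Linux; Android 13; SM-X700) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

const iphone = parseUA(IPHONE_UA);
const edge = parseUA(EDGE_UA);
const androidTablet = parseUA(ANDROID_TABLET_UA);
```

One known limit of this kind of matching: recent iPadOS Safari sends a desktop-style `Macintosh` UA by default, so those devices would be classified as `desktop` / `macOS`. Storing the raw `user_agent` does not recover that case, since the string itself carries no tablet marker.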

**`src/events.js`** — thin recorder.

```js
import { q } from './db.js';
import { parseUA } from './ua.js';

export function recordEvent(req, { type, email, sessionId, meta = null }) {
  // A missing User-Agent header yields null device fields and a NULL user_agent.
  const ua = req.headers['user-agent'] || '';
  const { device_type, os, browser } = parseUA(ua);
  // Arguments are listed in `events` column order (after `id`).
  q.recordEvent.run(
    type,
    email,
    Date.now(),
    sessionId || null,
    device_type,
    os,
    browser,
    ua || null,
    meta ? JSON.stringify(meta) : null
  );
}
```

The caller passes `sessionId` explicitly (rather than the recorder reading `req.cookies`) because the `login` event happens before `req.cookies` reflects the freshly issued session.

Fire-and-forget: callers ignore the result. The insert is synchronous (better-sqlite3 is sync) and fast enough that no try/catch wrapper is needed. If a future event becomes hot-path, revisit.


**`bin/events.js`** — CLI mirroring `bin/joins.js`. Subcommands:

- `node bin/events.js list [--type <event>] [--limit <n>]` — every event, newest first
- `node bin/events.js summary` — per-user counts by event type (one row per user, columns: email, logins, timeline_views, last_seen)
- `node bin/events.js for <email>` — full event history for one user
- `node bin/events.js stats` — totals per event type + unique users per event type + device-type breakdown

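To make the `summary` output shape concrete, here is an in-memory illustration of the aggregation. It is a sketch only: the real CLI would presumably get this from a prepared statement in `src/db.js`, and the function name `summarise`, the sample rows, and the "most recently seen first" ordering are all assumptions.

```js
// Hypothetical in-memory version of the per-user summary:
// one row per email with login count, timeline_view count, and last_seen.
function summarise(rows) {
  const byEmail = new Map();
  for (const row of rows) {
    let entry = byEmail.get(row.email);
    if (!entry) {
      entry = { email: row.email, logins: 0, timeline_views: 0, last_seen: 0 };
      byEmail.set(row.email, entry);
    }
    if (row.event_type === 'login') entry.logins += 1;
    else if (row.event_type === 'timeline_view') entry.timeline_views += 1;
    // last_seen is the newest occurred_at across all event types.
    entry.last_seen = Math.max(entry.last_seen, row.occurred_at);
  }
  // Assumed ordering: most recently active users first.
  return [...byEmail.values()].sort((a, b) => b.last_seen - a.last_seen);
}

// Made-up rows using the column names from the schema.
const sampleRows = [
  { email: 'a@example.com', event_type: 'login', occurred_at: 1000 },
  { email: 'a@example.com', event_type: 'timeline_view', occurred_at: 2000 },
  { email: 'b@example.com', event_type: 'login', occurred_at: 1500 },
  { email: 'a@example.com', event_type: 'timeline_view', occurred_at: 3000 },
];
const summary = summarise(sampleRows);
```

A user who logged in but never reached the timeline shows up with `timeline_views: 0`, which is the drop-off this tracking is meant to reveal.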
### Modified files

**`src/db.js`**
- Add the `events` table + indexes to the `db.exec` block.
- Add prepared statements: `recordEvent`, `listEvents`, `listEventsByType`, `listEventsForEmail`, `summariseEvents`, `countEventsByType`, `deviceBreakdown`.

**`src/sessions.js`**
- Refactor `issueSession(req, res, email)` to **return the new session ID** (currently returns nothing). The function already generates the ID internally — just `return id;` at the bottom. No callers break: existing call sites can ignore the return value.

**`src/auth.js`**
- Capture the returned ID: `const sessionId = issueSession(req, res, email);`
- After it succeeds, call `recordEvent(req, { type: 'login', email, sessionId })`.

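The `src/sessions.js` and `src/auth.js` changes fit together roughly as follows. This sketch injects its dependencies so the flow can be exercised without the real modules. `onLoginSuccess`, the fake session ID, and the `sid` cookie name are hypothetical; the actual handler structure in `src/auth.js` is not shown in this document.

```js
// Hypothetical wrapper around the two steps added to the login handler.
function onLoginSuccess(req, res, email, { issueSession, recordEvent }) {
  // After the refactor, issueSession() is assumed to set the cookie on `res`
  // and return the ID it generated.
  const sessionId = issueSession(req, res, email);

  // Pass the ID explicitly: req.cookies still reflects the pre-login request.
  recordEvent(req, { type: 'login', email, sessionId });

  return sessionId;
}

// Stand-ins for the real modules, for illustration only.
const recorded = [];
const fakeDeps = {
  issueSession: (_req, res, _email) => {
    const id = 'sess-123'; // placeholder; the real ID format is not specified here
    res.cookies.sid = id;  // hypothetical cookie name
    return id;
  },
  recordEvent: (_req, event) => recorded.push(event),
};
const fakeReq = { headers: { 'user-agent': 'test' }, cookies: {} };
const fakeRes = { cookies: {} };
const returnedId = onLoginSuccess(fakeReq, fakeRes, 'a@example.com', fakeDeps);
```

After the call, `fakeReq.cookies` is still empty while the recorded event already carries the new ID, which is why the recorder takes `sessionId` as a parameter instead of reading the request.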

**`server.js`**
- Move `MOBILE_UA_RE` from `server.js` into `src/ua.js` and import it back. The `wantsMobileView()` helper continues to use the same regex.
- On `app.get('/timeline', requireAuth, ...)`: after the dispatch decision, call `recordEvent(req, { type: 'timeline_view', email: req.session.email, sessionId: req.session.id, meta: { view, forced } })`. The session row is already attached to `req.session` by `requireAuth`, so no extra DB lookup is needed. Capture both `view` (`'mobile'` or `'desktop'`) and `forced` (`true` if `?view=` overrode the UA, otherwise `false`) before the call to `wantsMobileView()` collapses them.
- No new endpoints. No `/api/track/...` route.

**No frontend changes.** `public/entrance.js`, `protected/`, and the timeline assets are untouched.

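The `view` / `forced` capture for the `/timeline` route could look like the helper below. It is a sketch under assumptions: `resolveTimelineView` is a hypothetical name, `uaLooksMobile` stands in for the existing UA check behind `wantsMobileView()` (whose real signature is not shown here), and `forced` is interpreted as "a valid `?view=` value decided the outcome", even when it happens to agree with the UA guess.

```js
// Hypothetical helper: returns the view to serve and whether ?view= decided it.
function resolveTimelineView(query, uaLooksMobile) {
  const requested = query && query.view;
  // Assumption: only the exact values "mobile" and "desktop" count as an override.
  if (requested === 'mobile' || requested === 'desktop') {
    return { view: requested, forced: true };
  }
  // No usable override: fall back to the UA guess.
  return { view: uaLooksMobile ? 'mobile' : 'desktop', forced: false };
}

// The two timeline views from the CHECKLIST scenario, on a desktop browser.
const plainVisit = resolveTimelineView({}, false);
const forcedVisit = resolveTimelineView({ view: 'mobile' }, false);
```

The returned object has the same shape as the `meta` payload for `timeline_view`, so it could be passed to `recordEvent` unchanged.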
### Docs

**`CLAUDE.md`** — under "Common commands":
```
node bin/events.js list      # read engagement event log
                             # (also: summary, for <email>, stats)
```

**`OPERATIONS.md`** — short section "Engagement events" mirroring the existing "Bifrost joins" section: what's tracked, how to read, where the data lives.

**`CHECKLIST.md`** — add a row to the relevant section: "log in fresh; view timeline on desktop; force `?view=mobile`; run `node bin/events.js list` and confirm rows with correct device + meta fields (login, timeline_view × 2 with `forced: true` on the second)."

## Operational notes

- No new env vars.
- No CSP changes — no new client-side fetches.
- No security-invariant changes:
  - `events` table is write-only from app code, read-only from CLI.
  - No new public endpoints.
  - Email enumeration posture unchanged.
- Storage: ~150 bytes/row × low event volume (invite-list-only site) → trivial. No retention/pruning policy needed for the foreseeable future.

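As a back-of-envelope check on the storage note, the arithmetic can be written out. The 150 bytes/row figure comes from the note above; the event volume is a made-up input for illustration, not a measurement, and index overhead is ignored.

```js
// Rough table size in bytes: rows per day × days × bytes per row.
function estimateBytes(eventsPerDay, days, bytesPerRow = 150) {
  return eventsPerDay * days * bytesPerRow;
}

// Hypothetical volume: 200 events per day for one year.
const oneYearBytes = estimateBytes(200, 365);
const oneYearMiB = oneYearBytes / (1024 * 1024);
```

Even at that assumed rate, a year of events comes to roughly 10 MiB, which supports deferring any retention policy.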
## Out of scope (explicitly)

- Failed login tracking
- Logout tracking
- Scroll-depth / dwell-time on the timeline
- Per-section timeline views
- Folding `bifrost_joins` into the `events` table
- Web UI for viewing events (CLI only, matching `joins.js`)
- Bot/crawler filtering — the site is invite-list-only and fully gated, so no bots reach gated routes

## Decisions / non-obvious choices

- **Single `events` table** over per-event-type tables — device fields are identical across all event types; splitting would duplicate the schema.
- **`bifrost_joins` left alone** — has a working CLI and existing data; migration cost not justified now.
- **Hand-rolled UA parser** over `ua-parser-js` — project keeps a small dependency footprint; the parsed fields we need are coarse-grained.
- **Raw `user_agent` stored** — same field already lives on `sessions`, so no new privacy footprint, and lets us re-parse later if a regex misclassifies.
- **`issueSession` refactored to return the new ID** — lets the `login` event populate `session_id` properly without reading-back-the-cookie hacks. Tiny change, no callers break.
- **No `welcome_cta` event** — the welcome step has a single one-button transition that immediately navigates to `/timeline`. The click is fully captured by the subsequent `timeline_view` event; a separate `welcome_cta` event would duplicate that signal.