Technical Journal — Incident Hub & Homework Proof Failures

Date: 2026-01-25 → 2026-01-26
Project: Alice
Area: Customers > Homework, Admin > Insights


1. Problem Being Solved

Customers occasionally fail to upload homework proof images/videos due to:

  • network issues
  • backend upload errors
  • unexpected frontend edge cases
  • ffmpeg / processing failures downstream

Previously:

  • Customers saw the failure
  • Errors were logged in the homework-proof logs
  • But there was no structured, searchable, admin-level visibility
  • Debugging required SSH access and manual log scanning

➡️ Goal: turn frontend upload failures, which are currently invisible to admins, into first-class incidents that admins can inspect without reading logs.


2. Existing System (Baseline)

Homework Proof Upload

  • Sequential upload (file-by-file)
  • Each file is isolated (one failure does not block the others)
  • Accepts images + videos (image/*,video/*)
  • Android-safe
  • UI is DB-driven
  • Sticky error toasts already implemented
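
The sequential, per-file isolation described above can be sketched roughly as follows. This is a minimal illustration, not the actual uploader: uploadSequentially, UploadResult, and the injected uploadOne callback are hypothetical names, and the real code presumably works with File objects rather than file names.

```typescript
// Hypothetical result type: the journal does not name the real uploader API.
type UploadResult =
  | { fileName: string; ok: true }
  | { fileName: string; ok: false; error: string };

// Upload files one at a time. A failure is recorded for that file only,
// and the loop moves on to the next file.
async function uploadSequentially(
  fileNames: string[],
  uploadOne: (fileName: string) => Promise<void>, // assumed single-file uploader
): Promise<UploadResult[]> {
  const results: UploadResult[] = [];
  for (const fileName of fileNames) {
    try {
      await uploadOne(fileName); // the next file starts only after this settles
      results.push({ fileName, ok: true });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ fileName, ok: false, error });
    }
  }
  return results;
}
```

The try/catch sits inside the loop rather than around it, which is what keeps one failed file from blocking the rest.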

Video Processing

  • Original uploaded to Cloudflare R2
  • ffmpeg runs only in the queue
  • Status flow:
    • uploaded
    • scheduled
    • processing
    • ready
    • failed
  • If failed:
    • keep original
    • no auto retry
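
One way to picture the status flow is as a small transition table. The five status names come from the list above; the allowed transitions are an assumption (the journal lists the states in order but does not say from which states a job can move to failed), and VideoStatus, NEXT_STATUS, and canTransition are hypothetical names.

```typescript
type VideoStatus = "uploaded" | "scheduled" | "processing" | "ready" | "failed";

// Assumed transitions: a linear happy path, with failure possible during processing.
const NEXT_STATUS: Record<VideoStatus, VideoStatus[]> = {
  uploaded: ["scheduled"],
  scheduled: ["processing"],
  processing: ["ready", "failed"],
  ready: [],
  failed: [], // no auto retry: failed is terminal, and the original file is kept
};

// Returns true when moving from one status to another is allowed by the table.
function canTransition(from: VideoStatus, to: VideoStatus): boolean {
  return NEXT_STATUS[from].indexOf(to) !== -1;
}
```

Leaving the failed entry empty encodes the "no auto retry" rule: nothing in the table moves a failed video back into the queue.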

Logging

  • Dedicated channel: homework-proof
  • Structured logs
  • correlation_id per file/action
  • Used for:
    • uploads
    • progress updates
    • video jobs

3. New Concept Introduced: Incident Hub

Purpose

A central, admin-only hub for frontend-captured errors.

This is not:

  • log viewer
  • queue inspector
  • monitoring dashboard

This is:

  • structured error collection
  • searchable
  • DB-backed
  • production-safe

4. Incident Hub – Data Model

Minimum fields

  • id
  • error_key (string, indexed)
  • description (JSON)
  • created_at
  • updated_at

Practical extensions (allowed)

  • count
  • first_seen_at
  • last_seen_at
  • environment
  • url
  • route_name
  • user_id
  • user_agent
  • release

Design principle: the schema stays flexible; the JSON description carries the context.
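
As a rough sketch, the record could be typed like this. The field names are taken from the lists above; the TypeScript types, the ISO timestamp format, and the newIncident helper are assumptions made for illustration only.

```typescript
// Free-form context: the JSON column is what keeps the schema flexible.
type IncidentDescription = Record<string, unknown>;

type Incident = {
  // Minimum fields
  id: number;
  error_key: string; // indexed
  description: IncidentDescription;
  created_at: string; // ISO 8601 timestamp (assumed format)
  updated_at: string;
  // Practical extensions, all optional
  count?: number;
  first_seen_at?: string;
  last_seen_at?: string;
  environment?: string;
  url?: string;
  route_name?: string;
  user_id?: number;
  user_agent?: string;
  release?: string;
};

// Hypothetical helper: builds a new row with only the minimum fields set.
function newIncident(
  id: number,
  errorKey: string,
  description: IncidentDescription,
  now: Date,
): Incident {
  const timestamp = now.toISOString();
  return { id, error_key: errorKey, description, created_at: timestamp, updated_at: timestamp };
}
```

Only the five minimum fields are required by the type, so an extension can be added later without touching existing producers.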


5. Incident Hub – Admin UI

Path: Admin > Insights > Incident Hub

List View

  • error_key
  • count (if present)
  • last_seen_at / updated_at
  • created_at
  • pagination
  • search by error_key

Detail View

  • error_key
  • description rendered as formatted JSON
  • read-only
  • delete allowed (admin only)

6. New Behavior Added (Key Change)

Trigger Point

Customers > Homework uploader

An incident is reported whenever ANY image or video upload fails, regardless of the reason.

Failure definition

  • non-2xx response
  • network / timeout
  • backend error response
  • malformed response
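
The four failure cases above could be detected along these lines. This is a sketch under stated assumptions: it assumes the upload endpoint answers with a JSON body and signals a backend error through an "error" field, neither of which the journal specifies. UploadOutcome and describeFailure are hypothetical names.

```typescript
// What the uploader observed: either an HTTP response or no response at all.
type UploadOutcome =
  | { kind: "response"; status: number; body: string }
  | { kind: "network_error"; message: string }; // includes timeouts

// Returns an error message when the outcome counts as a failure, or null on success.
function describeFailure(outcome: UploadOutcome): string | null {
  if (outcome.kind === "network_error") {
    return `Upload failed: ${outcome.message}`;
  }
  if (outcome.status < 200 || outcome.status > 299) {
    return `Upload failed: HTTP ${outcome.status}`; // non-2xx response
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(outcome.body);
  } catch {
    return "Upload failed: malformed response";
  }
  if (typeof parsed !== "object" || parsed === null) {
    return "Upload failed: malformed response";
  }
  // Assumed shape: a 2xx body may still carry an "error" field from the backend.
  const backendError = (parsed as { error?: unknown }).error;
  if (backendError !== undefined) {
    return `Upload failed: ${String(backendError)}`;
  }
  return null;
}
```

Any non-null return value is a failure and therefore an incident, regardless of which branch produced it.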

7. Incident Reporting Rules

error_key

homework_proof

description (JSON)

Must include:

{
  "customer_id": 68,
  "customer_name": "Nguyen Van A",
  "student_id": 123,
  "file_name": "IMG_2392.mov",
  "error_message": "Upload failed: network timeout"
}

Optional (best effort)

  • skill
  • homework_id
  • homework_skill_id
  • http_status
  • route_name
  • correlation_id
  • user_agent
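
A small builder can enforce the required fields at compile time and attach the optional ones only when they are known. The field names and the homework_proof key come from the rules above; the field types and the buildHomeworkProofIncident helper are assumptions.

```typescript
type RequiredContext = {
  customer_id: number;
  customer_name: string;
  student_id: number;
  file_name: string;
  error_message: string;
};

// Best-effort fields; the types here are guesses.
type OptionalContext = {
  skill?: string;
  homework_id?: number;
  homework_skill_id?: number;
  http_status?: number;
  route_name?: string;
  correlation_id?: string;
  user_agent?: string;
};

function buildHomeworkProofIncident(required: RequiredContext, optional: OptionalContext = {}) {
  const description: Record<string, string | number> = { ...required };
  // Copy only the optional fields that actually have a value.
  for (const key of Object.keys(optional) as (keyof OptionalContext)[]) {
    const value = optional[key];
    if (value !== undefined) {
      description[key] = value;
    }
  }
  return { error_key: "homework_proof", description };
}
```

Because the required fields form their own type, leaving out customer_id or file_name is a compile error rather than a gap discovered during triage.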

8. Frontend Constraints (Important)

  • Incident reporting is fire-and-forget
  • Must never block:
    • UI
    • other uploads
  • If incident ingestion fails:
    • silently ignore
    • NO extra toast
    • NO user-visible change
  • Existing upload error UX remains unchanged
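
The fire-and-forget constraint can be sketched as a reporter that never awaits and never throws. The transport is injected as a send function because the journal does not describe the ingestion endpoint; reportIncident, Sender, and IncidentPayload are hypothetical names.

```typescript
type IncidentPayload = { error_key: string; description: Record<string, unknown> };

// Assumed transport, for example a thin wrapper around an HTTP POST.
type Sender = (payload: IncidentPayload) => Promise<unknown>;

// Returns immediately. Ingestion failures are swallowed, so the caller
// never sees an exception, a rejected promise, or an extra toast.
function reportIncident(send: Sender, payload: IncidentPayload): void {
  try {
    // Deliberately not awaited: the UI and other uploads are not blocked.
    send(payload).catch(() => {
      // Ingestion failed asynchronously: silently ignore.
    });
  } catch {
    // The sender threw before returning a promise: silently ignore as well.
  }
}
```

The existing upload error toast stays where it is; this call would sit next to it and has no return value for the UI to react to.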

9. Why This Design Works

  • Frontend becomes an early-warning sensor
  • Admins see patterns instead of isolated logs
  • Logs (homework-proof) remain the source for deep debugging
  • Incident Hub becomes the triage layer
  • No coupling to queue internals
  • No polling
  • No new auth system

10. Real Case That Triggered This

  • File: IMG_2392.mov
  • Customer ID: 68
  • Required manual log inspection:
    • storage/logs/homework-proof-2026-01-25.log
  • This incident directly motivated:
    • Incident Hub
    • frontend failure reporting

11. Mental Model Going Forward

  • Logs → for engineers
  • Incident Hub → for operators / admins
  • UI status → for customers
  • Each layer has a clear responsibility