Alice Dev — 05/02/2026

Homework archiving, queues, and learning to respect production gravity

Today was one of those days where the system teaches you humility.

What I worked on

I continued hardening the homework archiving pipeline in Alice, moving it from “works manually” toward “safe to automate”.

The manual command alice:homework:archive is now in a good place:

  • Dry-run by default, forcing intention before action.
  • Fully supports Google Drive Shared Drives, not just My Drive.
  • Copies files from R2 → Drive, then verifies before doing anything destructive.
  • R2 cleanup happens only after verified copy, and is explicitly best-effort.
  • Re-running the command is safe: already-archived homework is detected and skipped.
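As a sketch, the dry-run-by-default shape of the command looks roughly like this. The command name comes from the notes above; the `--execute` flag, class name, and helper method are my placeholders, not the real implementation:

```php
<?php

// Hypothetical skeleton of a dry-run-by-default artisan command.
// Only the signature string's command name is taken from the notes;
// everything else is illustrative.

namespace App\Console\Commands;

use Illuminate\Console\Command;

class ArchiveHomework extends Command
{
    protected $signature = 'alice:homework:archive
        {--execute : Actually copy and clean up; without it, dry-run only}';

    protected $description = 'Archive old homework from R2 to Google Drive';

    public function handle(): int
    {
        $dryRun = ! $this->option('execute');

        foreach ($this->candidates() as $homework) {
            if ($homework->archived_at !== null) {
                continue; // idempotent: already-archived homework is skipped
            }

            if ($dryRun) {
                $this->line("[dry-run] would archive homework {$homework->id}");
                continue;
            }

            // copy R2 -> Drive, verify, then best-effort R2 cleanup
        }

        return self::SUCCESS;
    }
}
```

The key design choice is that destruction requires an explicit opt-in flag, so a bare invocation can never delete anything.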

While testing this against real data, I ran straight into real-world Google Drive behavior:

  • Service accounts can create files but not delete them in Shared Drives.
  • Permission errors often surface as misleading 404 Not Found.
  • A failed delete does not mean the drive is unwritable.

This forced a rethink of “preflight checks”.

I corrected the logic so that:

  • Successful create == writable
  • Cleanup failures never block archiving
  • Deletion is treated as high-risk and non-blocking
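The corrected preflight interpretation can be reduced to a small pure function (names are mine, for illustration): only a failed create blocks archiving, and a failed delete is recorded but never fatal.

```php
<?php

// Sketch of the corrected preflight logic. Function and key names are
// hypothetical; the rules encode the three bullets above.

function interpretPreflight(bool $createOk, bool $deleteOk): array
{
    return [
        'writable'      => $createOk,   // successful create == writable
        'cleanup_ok'    => $deleteOk,   // best-effort only
        'block_archive' => ! $createOk, // only a failed create blocks
    ];
}

// Typical Shared Drive + service account case:
// create succeeds, delete is forbidden.
$result = interpretPreflight(true, false);
```

Here the Shared Drive case produces `writable: true` and `block_archive: false`, which is exactly the behavior the original preflight got wrong.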

Once the manual path felt calm, I designed the scheduled auto-archive flow:

  • Runs daily at 03:00 (Asia/Bangkok).
  • Discovers homework older than 12 days.
  • Dispatches archive jobs to the queue.
  • Everything logs into a dedicated, structured log file:
storage/logs/homework-archive-YYYY-mm-dd.log
  • Clear event-based logs (discovery.start, job.copy_ok, job.cleanup_failed, etc.) instead of noisy dumps.
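Assuming the Laravel scheduler and on-demand log channels, the wiring could look like this. The command name, log path, and 12-day cutoff are from the notes; the flag and event payloads are assumptions:

```php
<?php

// Hypothetical scheduler and logging wiring for the auto-archive flow
// (e.g. in routes/console.php on recent Laravel versions).

use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Schedule;

Schedule::command('alice:homework:archive --execute')
    ->dailyAt('03:00')
    ->timezone('Asia/Bangkok');

// Event-based logging into the dedicated per-day file via an
// on-demand single-file channel:
Log::build([
    'driver' => 'single',
    'path'   => storage_path(
        'logs/homework-archive-'.now()->format('Y-m-d').'.log'
    ),
])->info('discovery.start', ['cutoff_days' => 12]);
```

Stable event names like discovery.start make the log greppable and diffable across days, which is most of what "observability" means here.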

In parallel, I investigated a production queue crash:

Cannot assign null to property ProcessVideoProof::$proofCode

This turned out to be a classic queue + typed property footgun:

  • The job assumed a model would always exist.
  • In reality, records can disappear between dispatch and execution.
  • PHP typed properties don’t forgive that assumption.
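The failure mode is easy to reproduce in isolation: assigning null to a non-nullable typed property throws a TypeError at runtime. This minimal demo (class and variable names are mine) simulates the record having vanished before the job ran:

```php
<?php

// Minimal reproduction of the crash class: null into a non-nullable
// typed property is a runtime TypeError, not a silent null.

class ProofDemo
{
    public string $proofCode; // not nullable
}

$caught = false;

try {
    $demo = new ProofDemo();
    $record = null;              // "record" gone between dispatch and execution
    $demo->proofCode = $record?->code; // null -> TypeError
} catch (TypeError $e) {
    $caught = true;
}

echo $caught ? "TypeError caught\n" : "no error\n";
```

This is exactly why the job's "the model will always exist" assumption became a hard crash instead of a quiet null.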

I drafted a fix plan:

  • Never store live models in job state.
  • Load by ID inside handle().
  • Treat missing records as non-retryable skips, not fatal errors.
  • Log calmly and exit cleanly.
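Applied to the job, the plan might look like the sketch below. The class name comes from the error message; the `VideoProof` model, `proofId` constructor parameter, and log event name are my assumptions:

```php
<?php

// Sketch of the fix plan: store an ID, load inside handle(), and treat
// a missing record as a clean, non-retryable skip.

namespace App\Jobs;

use App\Models\VideoProof;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Support\Facades\Log;

class ProcessVideoProof implements ShouldQueue
{
    use Queueable;

    // Never store the live model in job state, only its ID.
    public function __construct(public int $proofId) {}

    public function handle(): void
    {
        // Load at execution time; the record may be gone by now.
        $proof = VideoProof::find($this->proofId);

        if ($proof === null) {
            // Non-retryable skip: log calmly and exit cleanly.
            Log::info('job.skipped_missing_record', [
                'proof_id' => $this->proofId,
            ]);
            return;
        }

        // ... process $proof as before ...
    }
}
```

Returning early (rather than throwing) means the queue never retries a job whose subject no longer exists.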

What I learned

  1. Automation amplifies mistakes. Anything destructive must be correct before it is automatic. Dry-run isn’t a feature — it’s a mindset.
  2. Cloud APIs lie by omission. A 404 doesn’t always mean “not found”. In shared systems, it often means “you’re not allowed to know”.
  3. Deletion is a privilege, not a right. Systems should be designed so cleanup can fail without causing harm.
  4. Queues demand defensive coding. Between dispatch and execution, the world changes. Typed properties make this painfully explicit.
  5. Operational calm is a real design goal. Clear logs, stable event names, idempotent behavior — these reduce cognitive load far more than clever abstractions.

Where the system stands now

  • Manual homework archiving: stable
  • Shared Drive edge cases: understood and handled
  • R2 cleanup: safe, intentional, non-blocking
  • Scheduled auto-archive: designed, ready to implement
  • Observability: first-class, not an afterthought

This was not a “build fast” day.

It was a “make it survive February, March, and next year” day.