Date: Today
System: Alice
Theme: Archiving, storage truth, and learning how production really behaves
What I worked on
Today was about turning “archive homework” from an idea into a production-safe operation.
I built and iterated on an alice:homework:archive command with very strict constraints:
- Dry-run by default
- Reversible where possible
- Explicit logging
- No silent failure
The command does more than just flip a status:
- Validates Google Drive destination
- Copies homework files from R2 to Google Drive
- Handles Shared Drive constraints
- Cleans up R2 only after successful copy
- Supports idempotency (already archived homework becomes cleanup-only)
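To make the ordering concrete, here is a minimal sketch of that flow in Python, for illustration only. It is not the actual alice:homework:archive code; Homework, FakeStore, and archive_homework are hypothetical names, and both stores are in-memory stand-ins for R2 and Google Drive.

```python
from dataclasses import dataclass, field


@dataclass
class Homework:
    id: int
    files: list
    archived: bool = False


@dataclass
class FakeStore:
    """In-memory stand-in for R2 or Google Drive (hypothetical)."""
    objects: set = field(default_factory=set)


def archive_homework(hw, r2, drive, dry_run=True):
    """Return the actions taken (or planned, when dry_run is True)."""
    actions = []
    will_copy = not hw.archived  # already-archived homework is cleanup-only
    if will_copy:
        for key in hw.files:
            actions.append(("copy", key))
            if not dry_run:
                drive.objects.add(key)
    for key in hw.files:
        # A file counts as safely copied if Drive holds it, or if this
        # dry run has just planned the copy.
        in_drive = key in drive.objects or (dry_run and will_copy)
        # Never delete from R2 unless the Drive copy is there.
        if key in r2.objects and in_drive:
            actions.append(("delete-r2", key))
            if not dry_run:
                r2.objects.discard(key)
    # Flip the status only when every file is present in Drive.
    if not dry_run and all(key in drive.objects for key in hw.files):
        hw.archived = True
    return actions
```

Calling archive_homework(hw, r2, drive) with the default dry_run=True only returns the plan and touches nothing. Running it a second time with dry_run=False after a successful archive returns an empty list, which is the idempotent behaviour described above.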
Once the manual command was stable, I designed the automation layer:
- A scheduled task at 03:00 AM
- Finds homework older than 12 days
- Dispatches archive jobs to the queue (no long work in scheduler)
- All actions logged into a dedicated daily incident log: storage/logs/homework-archive-YYYY-mm-dd.log, with structured, parseable entries for future reporting.
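The split between discovery and the long-running work can be sketched roughly like this, in Python for illustration only. Several things here are assumptions rather than facts from these notes: the age is measured from a created_at timestamp, the log entries are one JSON object per line, and the names log_path, log_entry, and discover_and_dispatch are hypothetical.

```python
import json
from datetime import datetime, timedelta

ARCHIVE_AFTER_DAYS = 12  # threshold from the schedule described above


def log_path(day):
    """Daily log file name, following the pattern given above."""
    return f"storage/logs/homework-archive-{day:%Y-%m-%d}.log"


def log_entry(now, event, **fields):
    """One JSON object per line (the JSON format is an assumption)."""
    return json.dumps({"ts": now.isoformat(), "event": event, **fields}, sort_keys=True)


def discover_and_dispatch(homework, now, dispatch):
    """Find old homework and enqueue one job each; no copying happens here.

    homework is an iterable of (id, created_at) pairs and dispatch is any
    callable that puts a job on the queue. Returns the log lines to write.
    """
    cutoff = now - timedelta(days=ARCHIVE_AFTER_DAYS)
    lines = []
    for hw_id, created_at in homework:
        if created_at < cutoff:
            dispatch(hw_id)
            lines.append(log_entry(now, "dispatched", homework_id=hw_id))
    return lines
```

The scheduler side stays cheap because it only compares timestamps and enqueues IDs. Anything that talks to R2 or Drive belongs in the queued job.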
Problems I hit (and why they mattered)
This was not a “happy path” day.
- Google Drive + Service Accounts are unforgiving
- Service accounts have no storage quota in My Drive.
- Shared Drives are mandatory.
- Even then, permissions behave differently than expected.
- Drive API lies politely
- 404 File not found doesn’t always mean “not found”.
- It can mean:
- wrong drive context
- missing permission
- operation not allowed (delete) even when create is allowed
- The API surface does not clearly distinguish these cases.
- Preflight checks can become self-sabotage
- I added a “touch test” (create + delete temp file).
- Creation succeeded.
- Deletion failed with 404.
- The system was actually writable — but my preflight logic said “fail”.
- Shared Drive permissions are asymmetric
- A service account may:
- create files
- but not delete or trash them
- That is normal and acceptable.
- The system must not treat that as a hard failure.
Key changes I made
- Redefined preflight success:
- Create success = writable
- Delete/trash = best-effort, non-blocking
- Improved diagnostics:
- Clearly separated driveId vs parentFolderId
- Logged exact IDs and operations
- Ensured safety guarantees:
- Never delete from R2 unless Drive copy succeeded
- Never archive homework if copying is incomplete
- Added operational structure:
- Queue jobs for long-running work
- Scheduled discovery only
- Single, clean, parseable log per day for incidents
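The redefined preflight can be sketched as follows, in Python for illustration only. This is not the real Drive client: DriveError, PreflightResult, the client's create and delete methods, and CreateOnlyClient are all hypothetical stand-ins, chosen to show that a failed delete is recorded but does not block.

```python
from dataclasses import dataclass


class DriveError(Exception):
    """Hypothetical error type standing in for a Drive API failure."""

    def __init__(self, status, message):
        super().__init__(f"{status} {message}")
        self.status = status


@dataclass
class PreflightResult:
    writable: bool    # create succeeded, so archiving may proceed
    cleanup_ok: bool  # temp file removed; informational only
    notes: list


def preflight(client, drive_id, parent_folder_id):
    # Log both IDs separately so a wrong drive context is easy to spot.
    notes = [f"driveId={drive_id} parentFolderId={parent_folder_id}"]
    try:
        file_id = client.create(drive_id, parent_folder_id, "preflight.tmp")
    except DriveError as exc:
        notes.append(f"create failed: {exc}")
        return PreflightResult(False, False, notes)
    notes.append(f"create ok fileId={file_id}")
    try:
        client.delete(drive_id, file_id)
        cleanup_ok = True
    except DriveError as exc:
        # Best-effort: create-only access is normal on a Shared Drive.
        cleanup_ok = False
        notes.append(f"delete failed (non-blocking): {exc}")
    return PreflightResult(True, cleanup_ok, notes)


class CreateOnlyClient:
    """Fake client that can create files but gets a 404 on delete."""

    def create(self, drive_id, parent_folder_id, name):
        return "tmp-1"

    def delete(self, drive_id, file_id):
        raise DriveError(404, "File not found")
```

With CreateOnlyClient, preflight reports writable=True and cleanup_ok=False, which matches the asymmetric-permission case above: the destination is usable even though the temp file could not be removed.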
What I learned
- Infrastructure truth beats assumptions
- APIs don’t always behave the way the docs suggest in edge cases.
- Production behavior is the only reliable source.
- Preflight checks must prove capability, not cleanliness
- “Can I write?” matters.
- “Can I clean up?” is secondary.
- Archiving is a lifecycle, not a flag
- Status change
- External storage migration
- Source cleanup
- Idempotency
- Automation
All of these must align or the system rots quietly.
- Good logs are future leverage
- By logging every archive action into a dedicated daily file, I’ve set up:
- future reports
- incident reviews
- confidence in automation
- This is observability, not noise.
State at end of day
- Manual archive command is operationally correct
- Shared Drive quirks are understood and handled
- R2 cleanup is safe and intentional
- Auto-archive design is ready and disciplined
- The system feels calmer, not more complex
Today wasn’t about speed.
It was about earning trust in a system that deletes data — and that’s the kind of work that actually matters.
