Files system (Lumen)
What I worked on
Today was deep plumbing work around EXIF extraction, colour profile correctness, and observability in the Files pipeline.
- Built and iterated on a new CLI command: files:exif:reextract {image_uuid}. This command can re-run EXIF extraction for a single image, print the full EXIF payload, and optionally persist it back to upload_sessions. It’s explicit, safe by default, and designed for investigation rather than silent repair.
- Extended the single-image bordered endpoint to expose full original EXIF metadata (read-only), keeping the API aligned with what the pipeline actually knows, not what we assume.
- Investigated why some images were being classified as sRGB when operationally they should be treated as RGB.
- Added a dedicated investigation log (bordered-investigation.log) to break out of the “safe but opaque” logging trap. This log is intentionally verbose and isolated, so I can see what path was resolved, for which image, and why—without polluting production logs.
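A minimal sketch of the isolated investigation log idea, written in Python purely for illustration (the log does not say what language or logging framework the pipeline uses). The logger name, the `enabled` gate, and the `log_path_decision` helper are hypothetical; only the file name bordered-investigation.log comes from the notes above.

```python
import logging
from pathlib import Path


def build_investigation_logger(log_path: Path, enabled: bool) -> logging.Logger:
    """Return a logger that writes only to its own file, never to the app log."""
    logger = logging.getLogger("bordered.investigation")  # hypothetical name
    logger.propagate = False  # keep entries out of the root (production) handlers
    logger.setLevel(logging.DEBUG)

    # Drop handlers from earlier calls so repeated setup does not duplicate lines.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    if enabled:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    else:
        # Gated off: swallow everything instead of falling back to stderr.
        logger.addHandler(logging.NullHandler())
    return logger


def log_path_decision(logger: logging.Logger, image_uuid: str,
                      resolved_path: str, reason: str) -> None:
    """Record which path was resolved, for which image, and why."""
    logger.debug("image=%s resolved_path=%s reason=%s",
                 image_uuid, resolved_path, reason)
```

Setting `propagate = False` is what keeps the verbose entries out of the production log, and the explicit `enabled` flag keeps the tool gated rather than always on.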
Key bug / edge case discovered
Using a real image and raw exiftool output, I confirmed:
- ExifIFD.ColorSpace reports sRGB
- But ICC-header.ColorSpaceData reports RGB (with trailing whitespace)
The system was trusting ExifIFD.ColorSpace by default, which is technically valid metadata—but operationally misleading for our pipeline. The ICC header is the stronger signal here.
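The comparison above can be sketched as a small helper that pulls both signals out of exiftool's JSON output. This is an illustration in Python, not the pipeline's code. It assumes the payload came from `exiftool -j -G1`, which keys tags as `Group:Tag` (so ExifIFD.ColorSpace appears as `ExifIFD:ColorSpace`). The sample payload is hand-written to mirror the case described here, not real tool output, and `colour_signals` is a hypothetical name.

```python
import json


def colour_signals(exiftool_json: str) -> dict:
    """Extract the two competing colour-space tags from an exiftool JSON payload."""
    records = json.loads(exiftool_json)  # exiftool -j emits a list, one dict per file
    record = records[0] if records else {}
    return {
        "exif_color_space": record.get("ExifIFD:ColorSpace"),
        "icc_color_space_data": record.get("ICC-header:ColorSpaceData"),
    }


# Hand-written sample mirroring the conflict described above (not real output).
sample = json.dumps([{
    "SourceFile": "example.jpg",
    "ExifIFD:ColorSpace": "sRGB",
    "ICC-header:ColorSpaceData": "RGB ",
}])

for name, value in colour_signals(sample).items():
    # repr() makes the trailing whitespace in the ICC value visible.
    print(f"{name}: {value!r}")
```

Printing with `repr()` matters: the trailing space in the ICC value is invisible in ordinary output and would make a naive equality check against "RGB" fail.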
Fix applied (domain rule)
- Gave top priority to ICC-header.ColorSpaceData (after trimming).
- If it exists and equals RGB, the image is classified as RGB, regardless of ExifIFD.ColorSpace.
- ExifIFD.ColorSpace is now a fallback, not the authority.
- Normalization logic is shared so that CLI output, stored normalized metadata, and API responses all agree.
This corrected the command output immediately and aligned the system with real-world colour handling expectations.
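The priority rule can be sketched as a single shared function. This is a Python illustration under stated assumptions, not the actual implementation: the function names are hypothetical, the metadata is assumed to be a flat dict keyed by the tag names used in these notes, and falling back to ExifIFD.ColorSpace when the ICC value is present but not RGB is my assumption (the notes only specify the RGB case).

```python
from typing import Any, Mapping, Optional


def normalize_tag(value: Any) -> Optional[str]:
    """Trim a raw tag value; treat missing or blank values as absent."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


def classify_colour_space(metadata: Mapping[str, Any]) -> Optional[str]:
    """Resolve the colour space, trusting the ICC header over surface EXIF."""
    icc = normalize_tag(metadata.get("ICC-header.ColorSpaceData"))
    if icc == "RGB":
        # The ICC header wins regardless of what ExifIFD.ColorSpace says.
        return "RGB"
    # Assumption: any other ICC value (or none) defers to the EXIF tag.
    return normalize_tag(metadata.get("ExifIFD.ColorSpace"))


# The conflicting case from the investigation: ICC says "RGB " and EXIF says sRGB.
conflicting = {"ICC-header.ColorSpaceData": "RGB ", "ExifIFD.ColorSpace": "sRGB"}
print(classify_colour_space(conflicting))
```

Routing the CLI, the persistence step, and the API serializer through one function like this is what guarantees they cannot disagree.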
What I learned / reinforced
- EXIF is not a single truth; it’s a layered signal system. You must decide which layer you trust and document why.
- “Safe logging” without decision context is worse than no logging: it hides bugs behind output that looks correct.
- Investigation tools should be explicit, gated, and brutally honest, even if they’re ugly.
- Colour profiles are not just metadata trivia; they materially affect downstream printing and borders. Getting them wrong silently is unacceptable.
State at end of day
- EXIF re-extraction is reproducible and inspectable.
- Colour profile normalization now matches ICC reality, not surface EXIF hints.
- I can finally see what the system is doing when something looks wrong.
This was a good day: fewer assumptions, more ground truth.
