The failure mode

Deep Sleep Phase 2 (extract.py) walks every session from the last 48 hours and asks the configured automation backend to return a JSON summary for each one. When the backend returned non-JSON output — a common semantic failure for large or ambiguous transcripts — the script retried the same session three times with identical prompt and context. The retries almost never succeeded, because the cause was deterministic, not transient.

Worse, the per-attempt safety net was 21600 seconds (6 hours). The comment next to that value read "3h safety net", but the value had drifted out of alignment with the comment across 10 scripts. In the worst observed case, a single unparseable session could stall the entire night's run for up to ~18h before the skip logic finally kicked in.

What 5.5.4 changes

Retries reduced from 3 to 2. Two attempts is enough to cover transient backend hiccups. A second failure is almost always deterministic, so further retries just burn time. Failed sessions are still checkpointed with error: "cannot_comply" and the run continues normally.

JSON escape hatch in the system prompt. The JSON_SYSTEM_PROMPT now instructs the model to return a structured {session_id, findings:[], error:"cannot_comply", reason:"…"} object whenever it cannot comply with the extraction schema for any reason. This guarantees parseable output even on degenerate inputs and removes the most common cause of retry loops.

Unified automation subprocess timeout. A new src/constants.py defines AUTOMATION_SUBPROCESS_TIMEOUT = 10800 (3 hours). All 10 scripts that previously hard-coded 21600extract.py, synthesize.py, nexo-evolution-run.py, nexo-agent-run.py, nexo-synthesis.py, nexo-sleep.py, nexo-catchup.py, nexo-immune.py, nexo-daily-self-audit.py, nexo-postmortem-consolidator.py — now import the shared constant. Future tuning is one edit, not ten.

Why this matters

Deep Sleep is the cron that carries every learning, pattern, and context packet from one working day to the next. When Phase 2 blocks, the morning briefing, the applied-findings pipeline, and the weekly summaries all run late or not at all. 5.5.4 turns a 10-script coordination footgun into a single-source-of-truth constant and gives the model an honest way to report "I cannot summarise this" without stalling the whole night.