NEXO 7.11.3 — Fingerprint excludes the versions/ snapshot store

Published 2026-04-27. Patch release over v7.11.2 — root-cause fix for the mcp_restart_required lockup that v7.11.2 only masked at the enforcer layer. No API change.

The symptom

After any nexo update that actually changed runtime .py bytes, every non-allowlisted MCP tool started returning the same payload — for hours, then days:

{
  "ok": false,
  "error": "mcp_restart_required",
  "reason": "fingerprint_mismatch",
  "installed_version": "7.11.2",
  "process_version":   "7.11.2"
}

The versions matched, yet every call bounced: nexo_reminders, nexo_smart_startup, nexo_guard_check, nexo_task_open … Restarting Claude Code, Codex or Claude Desktop did not help, because the new client connected to the same long-lived server.py with the same cached PROCESS_FINGERPRINT. v7.11.2's enforcer gate (HeadlessEnforcer._mcp_restart_pending) silenced the periodic <system-reminder> blocks asking for nexo_* tools, but the marker itself never cleared, so user-driven calls kept failing across sessions.

The root cause

v7.11.0 introduced the runtime fingerprint. The fingerprint is a SHA-256 digest over every .py file under src/ that the live MCP server can import, and it is computed in two places:

installed_runtime_fingerprint() — hot path, runs on every tool call via resolve_restart_required. It iterates a candidate list starting with active_runtime_root(), which on a real install resolves through the core/current symlink to core/versions/<active>/.
prime_process_fingerprint() — cold path, runs once at MCP server startup. It iterates a candidate list starting with Path(__file__).resolve().parent, which on a real install is the live core/ directory itself.

Both functions delegate to compute_mcp_runtime_fingerprint(src_dir), which walks the tree and hashes every .py file outside _FINGERPRINT_EXCLUDE_DIRS. v7.11.0 through v7.11.2 shipped this set:

_FINGERPRINT_EXCLUDE_DIRS = frozenset({
    "scripts", "tests", "migrations", "crons",
    "__pycache__", "node_modules", ".git",
})

Notice what's missing: "versions". The retained-snapshot store at core/versions/ — the per-release archive that activate_versioned_runtime_snapshot() writes during every nexo update — was being hashed every time the walker started from live core/. The walker that started from core/versions/<active>/ did not see the rest of the snapshot store at all.

Concretely, on a host with three retained snapshots:

installed_runtime_fingerprint:  f8e599fa0cead962…   # hash of core/versions/7.11.2/
prime_process_fingerprint:      8374add662ec5df4…   # hash of core/ (which still contains versions/7.10.0, versions/7.11.0, versions/7.11.2)

The two never matched after the second-ever nexo update on a host. resolve_restart_required() kept seeing installed_fp != process_fp on every call. _ack_current_client_if_restarted(), which would normally clear the marker once the live client and live process agreed, could not clear it either: its own gate ran the same fingerprint comparison and bailed for the same reason. The marker stayed on disk indefinitely.
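The deadlock can be pictured with a toy model. This is only a sketch of the control flow described above, not the project's code: RestartState, sketch_resolve_restart_required and sketch_ack_if_restarted are hypothetical names, and the fingerprint strings are placeholders.

```python
from dataclasses import dataclass


@dataclass
class RestartState:
    """Toy model; field names are illustrative, not the project's real state."""
    installed_fp: str            # digest the hot path computes from the active snapshot
    process_fp: str              # digest the server cached at startup
    marker_on_disk: bool = True  # the restart-required marker


def sketch_resolve_restart_required(state: RestartState) -> dict:
    # Hot path: any mismatch blocks the tool call.
    if state.installed_fp != state.process_fp:
        return {"ok": False, "error": "mcp_restart_required",
                "reason": "fingerprint_mismatch"}
    return {"ok": True}


def sketch_ack_if_restarted(state: RestartState) -> bool:
    # The ack is gated on the same comparison, so it bails on a mismatch too.
    if state.installed_fp != state.process_fp:
        return False
    state.marker_on_disk = False
    return True


# Placeholder digests standing in for the two mismatched fingerprints.
state = RestartState(installed_fp="fp-from-active-snapshot",
                     process_fp="fp-from-live-core")
for _ in range(3):               # three simulated client restarts
    sketch_ack_if_restarted(state)
result = sketch_resolve_restart_required(state)
print(result["error"], state.marker_on_disk)
```

However many times the ack runs, the marker survives and the tool call keeps returning mcp_restart_required, because both checks depend on an equality that cannot hold.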

The fix

One line:

_FINGERPRINT_EXCLUDE_DIRS = frozenset({
    "scripts", "tests", "migrations", "crons",
    "__pycache__", "node_modules", ".git",
    "versions",   # ← v7.11.3
})

Both fingerprint computations now hash exactly the same set of files regardless of which entry path the caller starts from. installed_fp == process_fp the moment a fresh server has loaded the new bytes; _ack_current_client_if_restarted() can clear the marker; non-allowlisted tools resume answering normally.
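The effect of the extra entry can be reproduced on a throwaway tree. The sketch below is an assumption-laden stand-in, not the real walker: sketch_fingerprint and build_core are hypothetical names, and the exact bytes fed into the hash (relative path plus file content) are a guess at a reasonable scheme. Only the two exclusion sets are taken from this note.

```python
import hashlib
import tempfile
from pathlib import Path

# Exclusion sets as quoted in this note.
OLD_EXCLUDE = frozenset({
    "scripts", "tests", "migrations", "crons",
    "__pycache__", "node_modules", ".git",
})
NEW_EXCLUDE = OLD_EXCLUDE | {"versions"}


def sketch_fingerprint(src_dir: Path, exclude: frozenset) -> str:
    """Hypothetical stand-in for compute_mcp_runtime_fingerprint()."""
    digest = hashlib.sha256()
    for path in sorted(src_dir.rglob("*.py")):
        rel = path.relative_to(src_dir)
        # Skip files that sit under any excluded directory name.
        if any(part in exclude for part in rel.parts[:-1]):
            continue
        digest.update(rel.as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def build_core(root: Path) -> tuple[Path, Path]:
    """Create a toy core/ with three snapshots; return (core, active snapshot)."""
    core = root / "core"
    files = {"server.py": "print('server')\n", "plugins/memory.py": "X = 1\n"}
    targets = [core] + [core / "versions" / v for v in ("7.10.0", "7.11.0", "7.11.2")]
    for base in targets:
        for name, body in files.items():
            target = base / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(body)
    return core, core / "versions" / "7.11.2"


with tempfile.TemporaryDirectory() as tmp:
    core, active = build_core(Path(tmp))
    old_match = sketch_fingerprint(core, OLD_EXCLUDE) == sketch_fingerprint(active, OLD_EXCLUDE)
    new_match = sketch_fingerprint(core, NEW_EXCLUDE) == sketch_fingerprint(active, NEW_EXCLUDE)
    print("old set, digests match:", old_match)
    print("new set, digests match:", new_match)
```

With the old set the walk from core/ picks up every file under versions/ and the two digests differ; with "versions" excluded both walks see the same two files and agree.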

Why v7.11.2 was not enough

v7.11.2 added HeadlessEnforcer._mcp_restart_pending(), a gate at the top of HeadlessEnforcer._enqueue() that suppresses periodic <system-reminder> prompts asking for nexo_* tools while the marker file exists on disk. That stopped the enforcer from spending agent cycles on guaranteed no-ops, which was real progress — but it never cleared the marker. The user-facing tool calls (nexo_reminders, nexo_task_open, every plugin and skill call) still bounced. The expectation was that the operator would restart the client and the new server would have a clean fingerprint, which is exactly what the bug prevented from happening.
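A rough sketch of that gate makes the limitation visible. SketchEnforcer is a hypothetical stand-in for HeadlessEnforcer; the queue attribute and the return values are assumptions made for illustration, and only the two method names come from this note.

```python
import tempfile
from pathlib import Path


class SketchEnforcer:
    """Illustrative stand-in for HeadlessEnforcer; not the real class."""

    def __init__(self, marker_path: Path) -> None:
        self.marker_path = marker_path
        self.queue: list[str] = []

    def _mcp_restart_pending(self) -> bool:
        # Only observes the marker; nothing here removes it.
        return self.marker_path.exists()

    def _enqueue(self, reminder: str) -> bool:
        if self._mcp_restart_pending():
            return False        # suppress the reminder while the marker exists
        self.queue.append(reminder)
        return True


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "mcp-restart-required.json"
    marker.write_text("{}")
    enforcer = SketchEnforcer(marker)
    suppressed = not enforcer._enqueue("call nexo_reminders")
    marker_survives = marker.exists()
    print(suppressed, marker_survives)
```

The reminder is dropped, but the marker is still on disk afterwards, so anything that checks it directly keeps failing.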

Tests

One new regression test in tests/test_runtime_fingerprint.py:

test_fingerprint_ignores_versions_subtree — seeds three snapshot directories under versions/ (7.10.0, 7.11.0, 7.11.2) with their own server.py and plugins/memory.py, then asserts compute_mcp_runtime_fingerprint(src) returns the same digest before and after the snapshots are added.

The existing exclude-dir test test_fingerprint_ignores_excluded_directories and the _make_runtime_tree helper now also include "versions" in their parametrized excluded list, so any future regression that drops "versions" from the set fails fast on existing coverage as well. All 21 tests in tests/test_runtime_fingerprint.py stay green.

Recovery on hosts already in the locked state

After updating to v7.11.3, the on-disk marker at ~/.nexo/runtime/operations/mcp-restart-required.json may still exist from earlier updates. The MCP server clears it on the next tool call once the fingerprints agree. If a stale fingerprint-cache.json at the same operations path is keeping installed_runtime_fingerprint() on the old digest, deleting that file forces a rehash on the next call. Restart any long-lived server.py processes that started before the update so their cached PROCESS_FINGERPRINT is recomputed; without that the process keeps the pre-update digest in memory regardless of what is on disk.
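The manual part of that recovery could be scripted along these lines. This is a hedged sketch: inspect_and_reset is a hypothetical helper, the two file names and the operations path are taken from the paragraph above, and treating the marker as JSON is an assumption based on its extension. It deliberately leaves the marker alone, since the server clears it once the fingerprints agree.

```python
import json
from pathlib import Path

# Path taken from the note above; adjust if your install differs.
OPERATIONS_DIR = Path.home() / ".nexo" / "runtime" / "operations"


def inspect_and_reset(operations_dir: Path = OPERATIONS_DIR) -> dict:
    """Report the marker state and remove a stale fingerprint cache, if any."""
    marker = operations_dir / "mcp-restart-required.json"
    cache = operations_dir / "fingerprint-cache.json"
    report = {"marker_present": marker.exists(), "cache_removed": False}
    if marker.exists():
        try:
            report["marker"] = json.loads(marker.read_text())
        except (OSError, json.JSONDecodeError):
            report["marker"] = None     # unreadable; leave it for the server
    if cache.exists():
        cache.unlink()                  # forces a rehash on the next call
        report["cache_removed"] = True
    return report
```

Calling inspect_and_reset() with no arguments targets the default path. Long-lived server.py processes still need a restart afterwards, because the helper cannot touch a digest held in memory.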

Full changelog entry · src/runtime_versioning.py · tests/test_runtime_fingerprint.py