Case Study: Multi-Tenant Agent
One client per machine. Fleet operations with hard boundaries.
We redesigned deployment so each client runs isolated on its own machine with standardized state directories, code-only release artifacts, and fresh target provisioning. Then we removed a dangerous fallback path, remediated leaked data, and shipped a hardening pass with 18 bug fixes plus sensitive-section stripping.
Snapshot
What changed in the platform
Architecture
One-client-per-machine with standardized state layout
Each deployment target now maps to exactly one client runtime. We standardized state directories so every machine follows the same file contract for logs, runtime artifacts, and local operational data. This removed ambiguity in fleet operations: the same paths hold the same kinds of data on every machine, which makes incident response predictable.
- Dedicated host per client workload
- Consistent state directory schema across all machines
- No shared process memory or tenant-level fallback lookup
- Operational scripts aligned to the same path structure
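A shared file contract like this can be checked mechanically. The Python sketch below is a minimal illustration, not our actual tooling: the subdirectory names (logs, runtime, data) and both function names are hypothetical stand-ins for the real schema.

```python
from pathlib import Path

# Hypothetical layout: these names are illustrative, not the platform's real schema.
STATE_SUBDIRS = ("logs", "runtime", "data")


def ensure_state_layout(root: Path) -> list:
    """Create the standard subdirectories under root and return their paths.

    Safe to run repeatedly: existing directories are left untouched.
    """
    created = []
    for name in STATE_SUBDIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def missing_state_dirs(root: Path) -> list:
    """Return the names of required subdirectories that are absent under root."""
    return [name for name in STATE_SUBDIRS if not (root / name).is_dir()]
```

Because every machine uses the same contract, one check of this kind can run unchanged across the fleet.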
Deploy Flow
End-to-end deployment flow after hardening
Build release bundle from source code only
Deploy artifacts now include code and deterministic config templates only. Secrets, runtime state, and memory files are excluded by rule.
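As a rough illustration of an exclude-by-rule filter, the Python sketch below selects which files may enter a bundle. The directory names and file suffixes are assumptions made for the example; the actual exclusion rules are not listed in this case study.

```python
from pathlib import PurePosixPath

# Hypothetical exclusion rules, chosen only for illustration.
EXCLUDED_DIRS = {"state", "memory", "secrets"}
EXCLUDED_SUFFIXES = {".env", ".key", ".pem"}


def is_bundle_safe(relative_path: str) -> bool:
    """Return True if the path may be included in a code-only release bundle."""
    path = PurePosixPath(relative_path)
    # Reject anything that lives under an excluded directory at any depth.
    if any(part in EXCLUDED_DIRS for part in path.parts):
        return False
    # A bare ".env" file has no suffix, so it is matched by name as well.
    if path.suffix in EXCLUDED_SUFFIXES or path.name == ".env":
        return False
    return True


def select_bundle_files(paths):
    """Return the sorted subset of paths that pass the exclusion rules."""
    return sorted(p for p in paths if is_bundle_safe(p))
```

A stricter variant would use an allowlist of known code and template paths instead of a denylist, so that an unfamiliar file type is excluded by default.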
Provision target environment from clean baseline
The target machine receives fresh tokens and local configuration during provisioning. Nothing sensitive is copied from a controller machine.
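One way to keep tokens local is to generate them on the target itself. The Python sketch below assumes a hypothetical token file name (gateway.token) and function name; it shows the idea of minting a fresh secret in place rather than copying one in, and is not our provisioning code.

```python
import os
import secrets
from pathlib import Path


def provision_token(state_root: Path, filename: str = "gateway.token") -> Path:
    """Generate a new random token on this machine and write it under state_root.

    The file name is a hypothetical example. On POSIX systems a newly created
    file gets owner-only permissions (0o600); an existing file keeps its mode.
    """
    state_root.mkdir(parents=True, exist_ok=True)
    token_path = state_root / filename
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text encoding
    fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return token_path
```

Each call replaces the previous token, so re-provisioning a machine never reuses an old credential.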
Install gateway with platform-native service flow
The gateway is installed and started through the platform's standard install path. No custom plist detours. This reduced configuration drift and startup failures.
Run validation and leakage checks
Post-deploy checks confirm tenant path isolation, token locality, and absence of fallback reads from controller-linked data.
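A path isolation check of this kind can be sketched in Python as follows. The function names are hypothetical, and the sketch covers only the path check, not token locality or fallback detection.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to root itself or to a path beneath it."""
    root_resolved = root.resolve()
    candidate_resolved = candidate.resolve()
    return (
        candidate_resolved == root_resolved
        or root_resolved in candidate_resolved.parents
    )


def find_isolation_violations(root: Path, configured_paths) -> list:
    """Return every configured path that escapes the tenant's state root."""
    return [str(p) for p in configured_paths if not is_within(root, Path(p))]
```

Resolving paths before comparing them means that ".." segments and symlinks pointing outside the state root are reported as violations rather than passing a plain string-prefix test.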
Critical Fixes
Removed risky fallback behavior and remediated exposure
Dangerous fallback removed
A legacy fallback path could read controller-side data when target state was missing. That path was removed to enforce strict local-only data resolution per tenant machine.
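Strict local-only resolution can be sketched as a read that fails loudly instead of looking elsewhere. In the Python example below, the exception and function names are hypothetical and do not come from our codebase.

```python
from pathlib import Path


class MissingLocalStateError(FileNotFoundError):
    """Raised when required state is absent on the local machine."""


def read_local_state(state_root: Path, name: str) -> str:
    """Read a state file from the local state root only.

    There is deliberately no fallback: if the file is missing, the caller
    gets an error and must provision the state locally.
    """
    path = state_root / name
    if not path.is_file():
        raise MissingLocalStateError(
            f"{name} not found under {state_root}; no fallback is attempted"
        )
    return path.read_text(encoding="utf-8")
```

Failing on missing state turns a silent cross-machine read into a visible provisioning error.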
cleanup-leaked-data remediation
We ran cleanup-leaked-data remediation to purge affected artifacts and reset machine state where needed. This aligned existing environments to the new isolation guarantees.
Sensitive-section stripping
Sensitive blocks are now stripped from deploy-bound outputs. The release pipeline no longer moves sections that can expose private operational context.
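A minimal Python sketch of marker-based stripping is shown below. The start and end markers are an assumed convention invented for this example; the case study does not specify how sensitive blocks are delimited.

```python
import re

# Hypothetical markers, used only for illustration.
SENSITIVE_BLOCK = re.compile(
    r"^[ \t]*<!-- sensitive:start -->.*?<!-- sensitive:end -->[ \t]*\n?",
    re.DOTALL | re.MULTILINE,
)


def strip_sensitive_sections(text: str) -> str:
    """Remove every marked sensitive block, including its marker lines.

    Fails closed: if a marker is left over after stripping (for example an
    unterminated block), raise instead of returning partially cleaned text.
    """
    stripped = SENSITIVE_BLOCK.sub("", text)
    if "sensitive:start" in stripped or "sensitive:end" in stripped:
        raise ValueError("unbalanced sensitive markers; refusing to emit output")
    return stripped
```

Raising on unbalanced markers keeps a malformed file out of the release instead of shipping it half-stripped.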
Hardening pass shipped
The hardening milestone included 18 bug fixes focused on deployment correctness, state safety, and predictable fleet behavior under failure conditions.
Field Lessons
First real deployment on the Cosmo machine
The first full real-world deployment on the Cosmo machine surfaced practical rollout lessons: enforce standard gateway lifecycle commands, avoid custom service wrappers, keep provisioning idempotent, and validate tenant boundaries before enabling full workload automation. These lessons were folded into the default runbook for future machines.
Result: deployment became repeatable, safer to operate at fleet scale, and easier to audit.
Security Impact
Client isolation is now a default property, not a best effort.
With one-client-per-machine boundaries, code-only deploys, fresh local provisioning, and fallback removal, the cross-tenant data leakage paths we identified were eliminated from the deployment model.
Build a hardened deployment model with us