Most cloud providers ship browser SSH the same way: a small daemon runs on every VM, connects out to the control plane over a websocket, and proxies a PTY through it. It works. It also means every customer VM has an extra process you can't audit, a long-lived outbound connection, and a vector for the platform to escalate into the tenant.

CloudNx CloudShell deliberately runs no agent inside the customer VM. Here's how it works and what we gave up to get there.

The constraint

Browser SSH for us has to:

Open in <2 seconds from Console click to a usable prompt.
Be auditable end-to-end — every command logged with the right user_id.
Add zero attack surface inside the VM. No daemon, no listener, no out-of-band tunnel.
Survive across customer-initiated reboots without re-onboarding.

The architecture

We exploit the fact that we already operate the host (Proxmox). Every customer VM has a private IP on 10.10.0.0/24 and an SSH port-forward managed by the NAT reconciler. The host can SSH into the VM directly using a platform key that we install during provisioning: once, idempotent, scoped to cnx-platform.
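To make "once, idempotent" concrete, here is a minimal sketch of what an idempotent key install could look like. It is an illustration, not our provisioning code: the function names ensure_platform_key and is_customer_vm are hypothetical, and it assumes the key ends up as one line in an authorized_keys file.

```python
import ipaddress

# Subnet named in the post; everything else in this block is illustrative.
VM_NETWORK = ipaddress.ip_network("10.10.0.0/24")


def ensure_platform_key(authorized_keys: str, key_line: str) -> str:
    """Return authorized_keys content with key_line present exactly once.

    Running this twice yields the same content as running it once,
    which is the idempotence property the provisioning step relies on.
    """
    key_line = key_line.strip()
    lines = authorized_keys.splitlines()
    if any(line.strip() == key_line for line in lines):
        return authorized_keys  # already installed: leave the file untouched
    lines.append(key_line)
    return "\n".join(lines) + "\n"


def is_customer_vm(ip: str) -> bool:
    """True if the address falls inside the customer VM subnet."""
    return ipaddress.ip_address(ip) in VM_NETWORK
```

Because the function returns the input unchanged when the key is already present, re-running provisioning after a reboot or a retry does not duplicate the entry.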

When the user clicks Console, the compute service:

Validates the JWT and the user's IAM permission for that instance.
Mints a short-lived (5 min) access token for the websocket.
Spawns a containerized SSH client that connects to the VM with the platform key, capturing both the PTY and stderr.
Pipes the websocket to the SSH client's stdin/stdout. Browser ↔ websocket ↔ SSH ↔ VM.
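The middle two steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production service: the HMAC-signed token format, the function names (mint_ws_token, verify_ws_token, build_ssh_argv), and the use of cnx-platform as the login user are all hypothetical. The ssh flags themselves (-tt to force a PTY, -i for the identity file, BatchMode and IdentitiesOnly) are standard OpenSSH options.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional

TOKEN_TTL_SECONDS = 5 * 60  # "short-lived (5 min)" from the post


def mint_ws_token(secret: bytes, user_id: str, instance_id: str,
                  now: Optional[float] = None) -> str:
    """Sign a small claims payload that expires TOKEN_TTL_SECONDS from now."""
    now = time.time() if now is None else now
    claims = {"uid": user_id, "iid": instance_id,
              "exp": int(now) + TOKEN_TTL_SECONDS}
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_ws_token(secret: bytes, token: str,
                    now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong shape or bad base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(payload)
    if claims["exp"] <= now:
        return None
    return claims


def build_ssh_argv(vm_ip: str, key_path: str,
                   user: str = "cnx-platform") -> List[str]:
    """Build the ssh command line; the login user is an assumption."""
    return [
        "ssh",
        "-tt",                        # force PTY allocation
        "-i", key_path,               # platform key
        "-o", "BatchMode=yes",        # never prompt for a password
        "-o", "IdentitiesOnly=yes",   # offer only the platform key
        f"{user}@{vm_ip}",
    ]
```

In a real service the websocket handler would verify the token on connect, then hand the argv to a process spawner and relay bytes in both directions. Binding the instance ID into the token means a token minted for one VM cannot be replayed against another.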

What we gave up

Resilience to host compromise. If our host is breached, the attacker has the platform key and can SSH into every customer VM. We document this. The agent-based approach has the same weakness via the daemon's outbound channel, but at least the tenant could observe traffic from inside; ours is fully invisible to them.

We accept this trade because the alternative — running an agent — doesn't actually solve the underlying trust assumption (you already trust the platform to run your code). It just adds surface.

What we got

Sub-second open times. Zero footprint inside customer VMs. Audit logs that match exactly what was typed because the host wraps every command. And one less thing to maintain across kernel versions, distros, and customer-installed firewalls.
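For a sense of how host-side auditing can attribute each command to a user, here is a deliberately simplified sketch. It is not how our wrapper is implemented: the AuditRecorder class and its record format are hypothetical, and it only handles plain typed input with backspace. Escape sequences, tab completion, and shell history recall are out of scope here.

```python
import json
from typing import List


class AuditRecorder:
    """Sits between the websocket and the SSH client's stdin.

    Bytes pass through unchanged; each completed line is recorded
    together with the user and instance it belongs to.
    """

    def __init__(self, user_id: str, instance_id: str) -> None:
        self.user_id = user_id
        self.instance_id = instance_id
        self._buf = bytearray()
        self.records: List[str] = []  # JSON lines; a real system would ship these

    def feed(self, data: bytes) -> bytes:
        for byte in data:
            if byte in (0x0D, 0x0A):        # Enter (CR or LF)
                if self._buf:
                    self.records.append(json.dumps({
                        "user_id": self.user_id,
                        "instance_id": self.instance_id,
                        "command": self._buf.decode("utf-8", errors="replace"),
                    }))
                    self._buf.clear()
            elif byte in (0x7F, 0x08):      # DEL or backspace
                if self._buf:
                    self._buf.pop()
            else:
                self._buf.append(byte)
        return data  # forward untouched to the SSH client
```

The point of the sketch is the placement: because the recorder runs on the host side of the pipe, the user_id comes from the authenticated session rather than from anything inside the VM.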

If you've ever debugged a flaky ssm-agent on Amazon Linux 2 or chased an out-of-memory ec2-instance-connect, this trade-off probably makes sense to you.