
Chapter 4: Network & Proxy


Your Server Needs Internet Too

You've got SSH working. You can connect to your server from your phone. Your tmux sessions survive disconnections. The infrastructure is coming together.

But here's something that trips up nearly every researcher at some point: your server needs to talk to the outside world, not just to you. It needs to download model weights from HuggingFace. It needs to pull datasets. It needs pip install and conda install to actually resolve packages. It needs to upload metrics to Weights & Biases.

Some lab servers have unrestricted internet access. Lucky you — skip the proxy section later in this chapter. But many don't. University firewalls, corporate networks, air-gapped clusters — there are plenty of reasons your server might not be able to reach huggingface.co or pypi.org directly.

And even if your server does have internet access, there's another problem that has nothing to do with firewalls: SSH connection management. Every time you type ssh your-server, your machine opens a brand-new TCP connection, does a full key exchange, and authenticates from scratch. When you're running Claude Code, which might issue dozens of SSH commands per minute to check GPU status, sync files, read logs, and run training scripts, that's dozens of separate connections. It's slow, it hammers the server's SSH daemon, and sometimes the server starts rejecting connections because you've hit its limit on new or concurrent connections.

This chapter solves both problems. First, we'll set up SSH multiplexing so all your sessions share a single connection. Then, for those who need it, we'll set up a proxy so your server can reach the internet through your local machine.


SSH ControlMaster (Everyone Needs This)

This section isn't optional. Whether your server has internet access or not, whether you use a proxy or not, ControlMaster makes every SSH interaction faster and more reliable. Set it up now.

The Problem

Without ControlMaster, every SSH command is independent. When Claude Code runs ssh your-server "nvidia-smi", that's:

  1. Open a TCP connection
  2. Negotiate encryption (key exchange)
  3. Authenticate (public key or password)
  4. Run the command
  5. Close the connection

Now Claude Code runs ssh your-server "cat /tmp/training.log". Same thing — full connection setup, all over again. And again for ssh your-server "tmux list-sessions". And again for rsync. And again for every health check.

Each connection takes 1-3 seconds to establish. That latency adds up fast. Worse, some servers limit concurrent SSH connections. You'll start seeing ssh_exchange_identification: Connection closed by remote host errors (newer OpenSSH versions print kex_exchange_identification instead), and your automation grinds to a halt.

The Fix

SSH ControlMaster lets you establish one master connection, and then every subsequent SSH command to the same server piggybacks on that connection. No new TCP handshake, no new authentication. The second ssh command is essentially instant.

First, create a directory on your local machine to hold the control sockets:

bash
mkdir -p ~/.ssh/sockets

Then add this to ~/.ssh/config (or edit the existing Host * block):

Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist yes
    ServerAliveInterval 60
    ServerAliveCountMax 3

That's it. Let's break down what each line does.

ControlMaster auto — The first SSH connection to a server automatically becomes the master. You don't have to think about it. Subsequent connections detect the master and use it.

ControlPath ~/.ssh/sockets/%r@%h-%p — This is the Unix socket file that the master connection creates. The %r@%h-%p pattern means it's unique per user, host, and port — so connections to different servers don't interfere with each other. We put them in ~/.ssh/sockets/ to keep things tidy.

ControlPersist yes — This is the important one. When you close the terminal that started the master connection, the master stays alive in the background. Without this, closing your terminal would kill the master and break every other connection using it. With yes, the master persists indefinitely — until you explicitly stop it or the server drops the connection.

ServerAliveInterval 60 — Send a keepalive packet every 60 seconds. This prevents firewalls and NATs from killing idle connections. Without this, a connection that's been quiet for a few minutes might silently die.

ServerAliveCountMax 3 — If 3 consecutive keepalive packets get no response, declare the connection dead and close it. Combined with the 60-second interval, this means a dead connection gets detected within 3 minutes.
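To confirm these settings are actually in effect for a host, ssh -G (OpenSSH 6.8 and later) prints the configuration ssh would use without connecting. The helper below is a minimal sketch: the name mux_summary is made up for this example, and your-server is a placeholder. It filters that output down to the five options above and multiplies the two keepalive values to show roughly how long a dead link takes to detect.

```shell
# mux_summary: read `ssh -G <host>` output on stdin, print the multiplexing
# and keepalive settings, then the approximate time to detect a dead link.
# (Hypothetical helper name; not part of OpenSSH.)
mux_summary() {
    awk '
        { key = tolower($1) }
        key ~ /^(controlmaster|controlpath|controlpersist|serveraliveinterval|serveralivecountmax)$/ { print }
        key == "serveraliveinterval" { interval = $2 }
        key == "serveralivecountmax" { count = $2 }
        END {
            if (interval != "" && count != "")
                print "dead link detected after about " (interval * count) " seconds"
        }
    '
}

# Usage on your LOCAL machine:
#   ssh -G your-server | mux_summary
```

If an option is missing from the output, the Host * block probably isn't being read, or an earlier, more specific Host block is overriding it.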

How It Feels

Before ControlMaster:

bash
$ time ssh your-server "echo hello"
hello
real    0m1.842s       # almost 2 seconds for "echo hello"

$ time ssh your-server "echo hello"
hello
real    0m1.756s       # same — full connection setup every time

After ControlMaster (second command onward):

bash
$ time ssh your-server "echo hello"
hello
real    0m0.089s       # near-instant — reusing the master connection

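Twenty times faster. And more importantly, the reused sessions don't open new TCP connections, so they don't trip the server's limits on new connections. (sshd does cap the number of sessions sharing one connection via MaxSessions, 10 by default, which only matters if you run many commands at the same time.)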

Managing the Master

You'll rarely need to manage the master connection manually, but when you do:

bash
# Check if a master connection is alive
ssh -O check your-server

# Kill the master connection (all shared sessions will close)
ssh -O exit your-server

# Start a new background master
ssh -MNf your-server

The -MNf flags mean: -M force master mode, -N don't execute a remote command, -f go to background. This creates a persistent master connection that sits there doing nothing except keeping the tunnel alive.
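If you'd rather not remember which of those commands to run, they combine into one small helper. This is a sketch, not a standard tool: ensure_master is a made-up name, and it uses only the ssh -O check and ssh -MNf invocations shown above.

```shell
# ensure_master HOST: start a background master for HOST unless one is
# already running. (Hypothetical helper; uses only the ssh flags shown above.)
ensure_master() {
    if ssh -O check "$1" >/dev/null 2>&1; then
        echo "master already running for $1"
    else
        echo "starting master for $1"
        ssh -MNf "$1"
    fi
}

# Usage on your LOCAL machine:
#   ensure_master your-server
```

Running it before a long automated session means the first real command doesn't pay the connection setup cost.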


Proxy Setup (Conditional)

Skip this section if your server can directly access external services like HuggingFace and PyPI. Not sure? Run ssh your-server 'curl -s https://huggingface.co | head -1' and see if you get HTML back. If you do, skip ahead to the checkpoint.

The Problem

Your server is behind a firewall. Or your university's network blocks outbound connections to certain ports. Or your cloud provider's VPC doesn't have a NAT gateway. Whatever the reason, when you SSH into the server and try to download a model:

bash
$ pip install transformers
ERROR: Could not find a version that satisfies the requirement transformers

$ huggingface-cli download meta-llama/Meta-Llama-3-8B
ConnectionError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded

The server simply can't reach the internet. Every package installation, every model download, every dataset fetch — dead on arrival.

The Solution: SSH RemoteForward

Here's the idea. Your local machine has internet access, and in this setup it runs a local proxy on port 7890 (Clash, V2Ray, whatever you use). SSH has a feature called RemoteForward that creates a tunnel: it takes a port on the server and forwards it to a port on your local machine. Anything the server sends to its own port 10808 gets tunneled through the SSH connection to port 7890 on your local machine, where the proxy passes it on to the internet. If your proxy listens on a different port, substitute that port for 7890 throughout.

Add this to your ~/.ssh/config on your local machine:

Host your-server
    HostName your.server.ip
    User your-username
    RemoteForward 10808 127.0.0.1:7890

Now when you SSH into your-server, port 10808 on the server automatically points to port 7890 on your local machine. The server doesn't need any special configuration — it just needs to know to use 127.0.0.1:10808 as its proxy.

On the server, set the proxy environment variables:

bash
export http_proxy=http://127.0.0.1:10808
export https_proxy=http://127.0.0.1:10808

That's it. Now pip install, huggingface-cli download, curl, and everything else that respects standard proxy environment variables will route through your local machine's internet connection.
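The config above only works if an HTTP proxy really is listening on local port 7890. If your local machine reaches the internet directly and runs no proxy, one alternative is to leave out the destination, in which case ssh itself acts as a SOCKS proxy on the forwarded port. Treat this as a sketch: it assumes OpenSSH 7.6 or newer on your local machine, and it replaces the RemoteForward line shown earlier rather than sitting alongside it.

```
Host your-server
    HostName your.server.ip
    User your-username
    RemoteForward 10808
```

On the server, the proxy URL then needs a SOCKS scheme, for example https_proxy=socks5h://127.0.0.1:10808. curl understands that scheme; Python tools such as pip may need SOCKS support installed first (for example the PySocks package), so test with curl before relying on it.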

Why ControlMaster Matters Here

There's a critical constraint with RemoteForward: only one SSH connection can hold a given port. If you SSH into your server twice, the second connection tries to bind port 10808 and fails — because the first connection already has it.

This is exactly why we set up ControlMaster first. With ControlMaster, the master connection holds the RemoteForward port, and every subsequent connection shares the master. No port conflicts. One tunnel, many users.

Without ControlMaster, you'd get this on every second connection:

Warning: remote port forwarding failed for listen port 10808

And you'd never be sure which connection actually has the working tunnel.

Where to Set Proxy Variables

You need the proxy environment variables set inside any shell session that needs internet access. But don't add them to .bashrc. Here's why: if your server sometimes has direct internet access (e.g., you set up the proxy later, or the firewall rules change), having proxy vars in .bashrc means every session routes through the tunnel — even when it's unnecessary or when the tunnel isn't active.

Instead, set them explicitly when you need them:

  • In tmux sessions for training: add export http_proxy=... and export https_proxy=... at the top of your launch commands
  • For one-off commands, prefix them with the variable: https_proxy=http://127.0.0.1:10808 pip install transformers (pip talks to PyPI over HTTPS, so https_proxy is the one that matters here)

This keeps things explicit and avoids the mysterious "why is my connection so slow" debugging session three months from now when you've forgotten about the proxy.
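One way to keep this explicit without retyping the exports is a pair of shell functions. The sketch below uses made-up names, proxy_on and proxy_off, and defaults to the 10808 port used in this chapter. It sets both the lower-case and upper-case spellings, because tools differ in which ones they read.

```shell
# proxy_on [PORT]: point this shell's proxy variables at the forwarded port.
# proxy_off: remove them again. (Hypothetical helper names; 10808 matches the
# RemoteForward port used in this chapter.)
proxy_on() {
    proxy_url="http://127.0.0.1:${1:-10808}"
    export http_proxy="$proxy_url" https_proxy="$proxy_url"
    # Tools differ in which spelling they read, so set the upper-case names too.
    export HTTP_PROXY="$proxy_url" HTTPS_PROXY="$proxy_url"
}

proxy_off() {
    unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY proxy_url
}

# Usage on the SERVER:
#   proxy_on && pip install transformers && proxy_off
```

Defining the functions in a startup file changes nothing until you call proxy_on, so each session still opts in deliberately.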

Troubleshooting

If curl https://huggingface.co stops working from the server after it was working earlier, walk through this checklist:

1. Is the master SSH connection still alive?

bash
# Run this on your LOCAL machine
ssh -O check your-server

If it says "No ControlPath" or "No such file," the master died. Reconnect:

bash
ssh -MNf your-server

2. Is your local proxy running?

If your local machine routes internet through a proxy on port 7890, make sure that proxy is actually running. The SSH tunnel forwards to port 7890 — if nothing's listening there, the tunnel connects but can't reach the internet.

3. Is the port actually forwarded?

SSH into the server and check:

bash
curl -x http://127.0.0.1:10808 -s https://huggingface.co | head -1

If this returns HTML, the tunnel is working. If it hangs or errors, go back to step 1.

4. Did another SSH connection steal the port?

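If you disabled ControlMaster or connected from a different machine, another SSH connection may already hold port 10808 on the server. Your master's forward then fails with the "remote port forwarding failed" warning, which is easy to miss. Close the other connections, then re-establish a clean master: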

bash
# On your local machine
ssh -O exit your-server 2>/dev/null
ssh -MNf your-server
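For steps 3 and 4, it also helps to ask the server directly whether anything is bound to the tunnel port. A minimal sketch, assuming the server has the iproute2 ss command and its usual column layout (state in the first column, local address:port in the fourth); port_is_listening is a made-up name:

```shell
# port_is_listening PORT: read `ss -ltn` output on stdin and succeed if some
# socket is listening on PORT. (Hypothetical helper; assumes the iproute2
# `ss` column layout, where the 4th column is the local address:port.)
port_is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the SERVER:
#   ss -ltn | port_is_listening 10808 && echo "tunnel port is bound"
```

If the port is bound but curl through it still fails, the problem is more likely on your local side (step 2) than in the tunnel itself.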

Checkpoint

Test that your server can reach the internet.

If your server has direct internet access (no proxy needed):

bash
ssh your-server 'curl -s https://huggingface.co | head -1'

If you set up the proxy:

bash
ssh your-server 'curl -x http://127.0.0.1:10808 -s https://huggingface.co | head -1'

You should see HTML output — something starting with <!DOCTYPE html> or similar. If you do, your server can reach the outside world. Models can be downloaded. Packages can be installed. You're ready to set up Claude Code.
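If you want this check in a script rather than eyeballing the output, a small filter can decide whether the first line looks like HTML. This is only a sketch with a made-up name, looks_like_html, and the pattern is a rough heuristic, not a real HTML test.

```shell
# looks_like_html: succeed if the first line on stdin starts like an HTML page
# (a <!DOCTYPE html> or <html tag, ignoring case and leading whitespace).
looks_like_html() {
    head -n 1 | grep -qiE '^[[:space:]]*<(!doctype html|html)'
}

# Usage on your LOCAL machine:
#   ssh your-server 'curl -s https://huggingface.co | head -1' | looks_like_html \
#       && echo "server can reach the internet"
```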

Also verify that ControlMaster is working:

bash
# Open one connection
ssh your-server "echo 'first connection'"

# Check that a master socket exists
ssh -O check your-server

You should see Master running with a PID. Every future SSH command to this server will be near-instant.

Released under the MIT License.