Balancers and auto-select: how it works

"Auto-select" sounds simple, but in practice it spans four different levels and at least five strategies, each optimizing something different. The most popular strategy predictably breaks under TSPU (the DPI equipment deployed at Russian ISPs). I'll cover where balancing lives, how the strategies differ, and why leastPing "works for a couple of minutes, then stalls." No configs.

This material is about engineering your own infrastructure and is educational in nature. Complying with the laws of your own jurisdiction is on you.

Two jobs of balancing

Balancing does two jobs at once, and it helps to keep them separate in your head:

  1. Distribute the load — so clients don't crowd onto one node while others sit idle.
  2. Survive a failure — so a block or the death of one node doesn't take the service down.

Different mechanisms solve these jobs differently, and the client-side "auto-select" is only one of the levels. There are four in total, from the client to the transport front.

Four levels

Level 1 — Xray balancers (server-side). Inside a single Xray, several exit outbounds are assembled into a pool, and a routing rule sends traffic not to a specific outbound but to a balancer, which picks a pool member itself. This is what makes a "node fan out to several exits."

Level 2 — a client-side balancer via the subscription. Several servers are put into the subscription, and the client itself tests them and picks the fastest (the canon is sing-box urltest, in Hiddify it's "Auto"). The downside: there's no seamless failover mid-connection — on a drop the client retests and reconnects.

Level 3 — a transport front (HAProxy/nginx). An L4 balancer in front of the nodes that routes by SNI, without decrypting TLS (Reality stays intact). The most reliable for a commercial service. An important limitation: this SNI-based approach balances only TCP — Hysteria2 (QUIC/UDP) can't be spread this way.

Level 4 — DNS round-robin / GeoDNS. One domain → several A records. Works for any protocol, but crudely: no health check (a dead node stays in rotation until you edit DNS), and DNS is also spoofed/blocked on top of that. Only as a last resort and for crude geo-direction, not as failover.
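
The missing health check at Level 4 is easy to see in a toy model. The sketch below is an illustration under stated assumptions, not a resolver: the addresses come from the documentation range 192.0.2.0/24, and `failed_fraction` is a hypothetical helper. It rotates through the A records the way a naive round-robin would and counts how many requests land on a dead node.

```python
from itertools import cycle


def failed_fraction(records, dead, requests):
    """Share of requests that round-robin hands to a dead address.

    Nothing removes `dead` addresses from the rotation, which mirrors
    plain DNS round-robin without a health check.
    """
    rotation = cycle(records)
    failures = sum(1 for _ in range(requests) if next(rotation) in dead)
    return failures / requests


records = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # hypothetical A records
share = failed_fraction(records, dead={"192.0.2.2"}, requests=300)
print(f"share of requests sent to the dead node: {share:.2f}")
```

With one of three records dead, roughly a third of the requests keep failing until someone edits DNS by hand.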

Four strategies of the Xray balancer

Inside the server-side balancer, the choice of a pool member is determined by the strategy, and there are four:

  • random — at random. The default, fine for homogeneous nodes.
  • roundRobin — in a circle, evenly.
  • leastPing — the lowest ping. It needs an observatory that periodically pings the pool members.
  • leastLoad — the most stable under load. It looks not at a single probe but at the latency distribution (baselines).

The difference between the last two isn't cosmetic — it's the root of the main problem of all "auto-selects."
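
To make the four strategies concrete, here is a minimal Python sketch of the selection logic. It is a toy model, not Xray's implementation: the `Pool` class, its method names, and the idea of ranking leastLoad candidates by the standard deviation of recent probes are all assumptions chosen for illustration.

```python
import random
import statistics
from itertools import count


class Pool:
    """Toy model of a balancer pool; not Xray's actual implementation."""

    def __init__(self, members):
        self.members = list(members)
        self._next = count()

    def pick_random(self, rng=random):
        # random: any member, uniformly.
        return rng.choice(self.members)

    def pick_round_robin(self):
        # roundRobin: walk the pool in a circle.
        return self.members[next(self._next) % len(self.members)]

    def pick_least_ping(self, last_probe_ms):
        # leastPing: looks only at the most recent probe of each member.
        return min(self.members, key=lambda m: last_probe_ms[m])

    def least_load_candidates(self, history_ms, expected=2):
        # Assumption: "stable" means a small spread across recent probes.
        ranked = sorted(
            self.members, key=lambda m: statistics.pstdev(history_ms[m])
        )
        return ranked[:expected]

    def pick_least_load(self, history_ms, expected=2, rng=random):
        # leastLoad (simplified): spread traffic over the stable candidates.
        return rng.choice(self.least_load_candidates(history_ms, expected))


pool = Pool(["a", "b", "c"])
history = {"a": [50, 52, 51], "b": [20, 200, 20], "c": [80, 84, 76]}  # made-up RTTs, ms
last = {member: probes[-1] for member, probes in history.items()}
print(pool.pick_least_ping(last))           # the member with the lowest last probe
print(pool.least_load_candidates(history))  # the members with the steadiest history
```

In this made-up history, member "b" has the lowest latest ping but the most erratic record, so the two strategies disagree about it.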

Why leastPing "works → stalls"

This is the most common complaint, and it isn't random: it's baked into the design. I'll go through it step by step, because understanding the mechanism is worth more than any ready-made config:

  • leastPing picks the pool member with the lowest ping. The pool usually contains bare raw-TCP+Reality+Vision — it has the minimal framing, hence the minimal ping → leastPing always picks exactly it.
  • On this transport TSPU throttles the bandwidth — it doesn't tear the connection, it squeezes the speed to almost zero.
  • Meanwhile the observatory pings a tiny generate_204 once every few dozen seconds — and this small probe passes even on a throttled channel. To the balancer the member looks healthy.
  • The result: leastPing never demotes this crushed raw member, because by the probe it's "fast." The real speed is zero. "Works → stalls."
  • Restarting the app helps by chance: a new connection settles on a still-alive transport.
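
The chain above can be reproduced in a few lines. This is a sketch under a deliberately crude assumption: throttling drops a member's real throughput to zero while the latency of the tiny probe stays unchanged. The member names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    probe_ms: float         # latency of the tiny generate_204 probe
    throughput_mbps: float  # what a real user actually gets


def least_ping(pool):
    """Pick the member with the lowest probe latency."""
    return min(pool, key=lambda m: m.probe_ms)


def throttle(member):
    # Model assumption: throttling crushes bandwidth but leaves the
    # small probe's latency untouched.
    member.throughput_mbps = 0.0


def simulate():
    pool = [Member("raw-vision", 40.0, 90.0), Member("grpc", 55.0, 80.0)]
    before = least_ping(pool).name
    throttle(pool[0])
    chosen = least_ping(pool)
    return before, chosen.name, chosen.throughput_mbps


print(simulate())
```

The balancer's choice is the same before and after throttling, because the only signal it reads (probe latency) never changed.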

Measurement confirms it: with the same key, across a series of connect → speedtest → hold probes, vision-on-raw succeeds about a third of the time, while gRPC succeeds almost always. This points to the ideas that cure the balancer.

What to do about it (ideas, not commands)

  • Remove raw+Vision from the pool — leave gRPC/xHTTP members. raw survives a block worst of all.
  • leastLoad instead of leastPing — it looks at the latency distribution rather than a small probe, and spreads wider, not sticking to one crushed member.
  • Direct nodes in the default pool, the cascade in reserve. A cascade throttles upload; for the main pool direct nodes are faster, and the cascade is kept as a bypass/fallback.
  • Block QUIC (UDP/443) with a routing rule placed ahead of the balancer rule; otherwise video gets stuck in the tunnel.
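
Why the order of the QUIC rule matters can be shown with a toy first-match router. This is a sketch, not Xray's routing engine: `Packet`, `Rule`, and `route` are hypothetical names, and treating QUIC as "UDP on port 443" is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    network: str  # "tcp" or "udp"
    port: int


@dataclass
class Rule:
    matches: Callable[[Packet], bool]
    target: str


def route(packet, rules, default="direct"):
    """First matching rule wins (toy model of ordered routing rules)."""
    for rule in rules:
        if rule.matches(packet):
            return rule.target
    return default


block_quic = Rule(lambda p: p.network == "udp" and p.port == 443, "block")
to_balancer = Rule(lambda p: True, "balancer")

quic = Packet("udp", 443)
print(route(quic, [block_quic, to_balancer]))  # QUIC rule first
print(route(quic, [to_balancer, block_quic]))  # balancer rule first
```

With the catch-all balancer rule first, the QUIC rule is never reached, so the traffic goes into the tunnel anyway.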

What to choose for a VPN service

The strategies optimize different things, and there's no universal one:

  • Failover (HAProxy/nginx backup) — about uptime: primary nodes plus a hot reserve, with health checks.
  • Least-conn — about long sessions: VPN connections are long, round-robin overloads the "lucky one," while least-conn distributes by the number of active connections.
  • Least-ping — about speed for a user in a multi-region setup, but with the caveat about stalling above.
  • Least-load — about stability under load and bursts on mixed-caliber nodes.
  • Source-hash — about stickiness: keeps one client's TCP and UDP on one node.
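
The least-conn and source-hash points can be checked with a small simulation. It is a sketch with made-up inputs: one session arrives per tick, long and short sessions alternate, and `peak_sessions` and `source_hash` are hypothetical helpers rather than HAProxy or nginx internals.

```python
import hashlib


def peak_sessions(durations, nodes, policy):
    """Peak number of simultaneous sessions on any single node."""
    active = {n: [] for n in nodes}  # node -> end times of open sessions
    peak = 0
    for t, duration in enumerate(durations):
        for n in nodes:  # drop sessions that have already ended
            active[n] = [end for end in active[n] if end > t]
        if policy == "round_robin":
            node = nodes[t % len(nodes)]
        elif policy == "least_conn":
            node = min(nodes, key=lambda n: len(active[n]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        active[node].append(t + duration)
        peak = max(peak, len(active[node]))
    return peak


def source_hash(client_ip, nodes):
    """Map a client address to one node, the same one every time."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = ["n1", "n2"]
durations = [100, 1] * 5  # long and short sessions alternate
print(peak_sessions(durations, nodes, "round_robin"))
print(peak_sessions(durations, nodes, "least_conn"))
print(source_hash("198.51.100.7", nodes) == source_hash("198.51.100.7", nodes))
```

In this arrival pattern round-robin hands every long session to the same node, while least-conn splits them because it counts what is still open.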

A working combination for a commercial service: an HAProxy/nginx front that routes by SNI, with least-conn and backup nodes (failover and long sessions); on top of that, a client-side urltest (the user gets the fastest region); and Xray leastLoad for when one node fans out to several exits.

About node health

One trap to know about in advance: the server-side observatory pings from the node itself, and a node can be alive to the world yet cut off for Russian clients. The observatory won't see that problem. So health must also be checked from inside Russia, by a separate external worker, not only globally. That's a separate topic, covered in the healthcheck practice.
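
A minimal sketch of why the vantage point matters. The report format, the vantage names "self" and "ru-worker", and both helper functions are hypothetical; the only idea taken from the text is that a node can pass its own check while failing a check made from the client's side.

```python
def usable_nodes(reports, vantage):
    """Nodes that the given vantage point can actually reach."""
    return sorted(n for n, seen in reports.items() if seen.get(vantage, False))


def blind_spots(reports, server_side="self", client_side="ru-worker"):
    """Nodes that look healthy from the node itself but fail from the client side."""
    return sorted(
        n
        for n, seen in reports.items()
        if seen.get(server_side, False) and not seen.get(client_side, False)
    )


# Made-up check results: vantage point -> did the probe pass?
reports = {
    "node-a": {"self": True, "ru-worker": True},
    "node-b": {"self": True, "ru-worker": False},
}
print(usable_nodes(reports, "self"))
print(usable_nodes(reports, "ru-worker"))
print(blind_spots(reports))
```

A balancer fed only by the "self" column would keep sending clients to node-b.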

Next — in the practice: how to assemble a production leastPing/leastLoad auto-select, how the pool is defined by host tags, and the main template delivery traps. Step by step in "leastPing auto-select: building it."
