
Cascade transport: MTU, sockopt, why it breaks

The most agonizing cascade failure is the one that half works: Google opens but Cloudflare doesn't; the site loads but the file won't download. This isn't random, it's transport physics: MTU, nested wrappers, egress IP binding. I'll break down why each of these symptoms arises and how to recognize it. No configs, pure mechanics.

This material is about engineering your own infrastructure and is educational in nature. Complying with the laws of your own jurisdiction is on you.

Why a cascade breaks "halfway"

A direct node either works or it doesn't. A cascade can break partially, and that's exactly what's confusing. The reason is that each hop is another wrapper around the packet, and wrappers aren't free: they eat up space, accumulate latency, and multiply the chance that somewhere along the way something won't fit or will hang. Three classic symptoms (MTU collapse, nested vision, and a flaky sendThrough) each produce a different picture, and from that picture you can make a diagnosis without guessing.

MTU: why some sites go and others don't

MTU is the maximum packet size that a link carries without fragmentation. Each cascade wrapper (Reality, the transport, another hop) adds its own header bytes, and the usable space in the packet shrinks. Even with two hops the usable payload shrinks noticeably.
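The arithmetic behind "wrappers eat space" can be sketched in a few lines. The overhead figures below are illustrative assumptions, not measurements of Reality or of any particular transport; the point is only that every layer subtracts from the same fixed budget.

```python
# Hypothetical per-layer overheads in bytes. Real values depend on the
# protocols and options in use and should be measured, not assumed.
LINK_MTU = 1500
LAYER_OVERHEAD = {
    "ip+tcp headers": 40,      # IPv4 (20) + TCP (20), without options
    "wrapper, hop 1": 29,      # illustrative figure only
    "wrapper, hop 2": 29,      # illustrative figure only
}


def usable_payload(link_mtu: int, overheads: dict) -> int:
    """Bytes left for application data after every wrapper takes its share."""
    return link_mtu - sum(overheads.values())


def budget_steps(link_mtu: int, overheads: dict) -> list:
    """Remaining budget after each layer, in insertion order."""
    steps = []
    remaining = link_mtu
    for name, cost in overheads.items():
        remaining -= cost
        steps.append((name, remaining))
    return steps
```

Calling `budget_steps(LINK_MTU, LAYER_OVERHEAD)` lists how the budget shrinks layer by layer; adding a third hop means adding one more entry to the dictionary.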

Now the key observation, which is the diagnosis. Different sites behave differently not by chance, but by the size of their TLS handshake:

Google, YouTube, Telegram — short certificate chains, a small handshake. It fits into the shrunken packet and goes.
Cloudflare, Instagram, GitHub — long certificate chains, a large handshake. It doesn't fit — and the connection breaks at the handshake itself.

If the client sees "Google works, Cloudflare doesn't," and the split follows the site rather than chance, it's almost certainly MTU. Not the transport, not a block, not the keys. A short handshake never produces a full-size packet, while a long one does, and full-size packets can't squeeze through the channel narrowed by the cascade.
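Why the split follows handshake size can be shown with a small model. This is a sketch under simplifying assumptions: IPv4, no TCP options, and made-up flight sizes; `flight_survives` is a hypothetical helper, not part of any tool.

```python
IP_TCP_HEADERS = 40  # IPv4 (20) + TCP (20), without options


def largest_packet(flight_bytes: int, sender_mss: int) -> int:
    """On-the-wire size of the biggest packet needed to send one flight."""
    return min(flight_bytes, sender_mss) + IP_TCP_HEADERS


def flight_survives(flight_bytes: int, sender_mss: int, path_mtu: int) -> bool:
    """True if no packet of the flight exceeds the narrowest link."""
    return largest_packet(flight_bytes, sender_mss) <= path_mtu
```

With a path narrowed to 1400 bytes and a sender that still assumes an MSS of 1460, a 900-byte flight passes because its only packet is 940 bytes, while a 4500-byte flight fails because its first packet is a full 1500 bytes.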

There's also a second, more insidious variety: the PMTU "black hole." Small packets get through, while large ones are silently dropped somewhere on the route, because the provider doesn't send the required ICMP "fragmentation needed" (IPv4) or "packet too big" (IPv6) message. Then the site opens (the small requests went through), but the download hangs (the large packets go nowhere). The symptom "opens but won't download" is exactly this.
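Because a black hole gives no error, the narrowest point has to be found by probing. Below is a minimal sketch of a binary search over packet sizes. The `probe` callable is an assumption: in practice it could wrap a ping with the don't-fragment bit set (on Linux with iputils, something like `ping -M do -s <size>`), but here it is injected so the search logic can be checked without a network.

```python
from typing import Callable


def find_path_mtu(probe: Callable[[int], bool], low: int = 576, high: int = 1500) -> int:
    """Largest size in [low, high] for which probe(size) succeeds.

    Assumes monotonic behaviour: if a size passes, every smaller size passes.
    Returns low - 1 if even the smallest size fails.
    """
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid       # mid fits, try something larger
            low = mid + 1
        else:
            high = mid - 1   # mid was dropped, try something smaller
    return best
```

For example, `find_path_mtu(lambda size: size <= 1372)` simulates a route that silently drops anything above 1372 bytes and returns that boundary in about ten probes.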

MTU problems are cured by one class of techniques: reduce the maximum segment size so the packet fits, with room to spare, into the narrowest part of the path. At the transport level it's a sockopt segment limit (usually around 1280–1360 bytes); at the kernel level it's an MSS clamp on the nodes. The essence is the same: you shrink the packet yourself in advance so it doesn't hit the channel's ceiling mid-route. After that the "stubborn" sites with a large handshake come back to life.
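The segment limit follows directly from the narrowest MTU on the path. A minimal sketch, assuming standard header sizes without TCP options; the `margin` parameter is a hypothetical knob for extra headroom and is not taken from any specific tool.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20  # without options


def mss_for_path(path_mtu: int, ipv6: bool = False, margin: int = 0) -> int:
    """Segment size that keeps every packet at or below path_mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    mss = path_mtu - ip_header - TCP_HEADER - margin
    if mss <= 0:
        raise ValueError("path MTU too small for the requested margin")
    return mss
```

A path MTU of 1400 gives an MSS of 1360 over IPv4 and 1340 over IPv6, which is consistent with the 1280–1360 range mentioned above.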

Nested vision: why a triple cascade is capricious

The second source of breaks is a peculiarity of XTLS Vision itself, the flow that a Reality node often applies automatically. Vision optimizes transmission by "splicing" streams, and this works great on one layer. But if you wrap vision inside vision inside vision (three hops in a row, each with its own vision), the wrappers start to conflict.

The mechanics of the failure: the handshake and small responses go through, but any large download stalls after a couple hundred bytes. This is a splicing deadlock, where two vision layers can't agree on who reads whom. From the outside it is indistinguishable from an MTU problem ("opens, won't download"), but the cure is different.

Hence the practical rule for multi-hop cascades: keep vision on only one hop and switch the rest to another transport (gRPC), where vision isn't imposed and doesn't nest inside itself. Two vision layers (entry + exit in a two-hop cascade) are still tolerable and proven to work. Three are not: HTTPS with a large handshake won't squeeze through. If you're building a triple cascade and it "opens the site, won't download the file" after MTU is already fixed, the culprit is almost certainly nested vision, not the packet size.
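The rule is easy to encode as a sanity check over a planned chain. This is a sketch with a made-up representation: each hop is a `(name, uses_vision)` pair, and the three verdict strings are hypothetical labels for the three cases described above.

```python
def check_vision_nesting(chain: list) -> str:
    """Classify a cascade by how many hops apply vision.

    chain is a list of (hop_name, uses_vision) pairs, entry first.
    """
    layers = sum(1 for _name, uses_vision in chain if uses_vision)
    if layers <= 1:
        return "ok"          # vision on a single hop, as recommended
    if layers == 2:
        return "tolerable"   # entry + exit in a two-hop cascade
    return "risky"           # three or more nested layers


triple = [("entry", True), ("relay", True), ("exit", True)]
fixed = [("entry", True), ("relay", False), ("exit", False)]
```

Here `check_vision_nesting(triple)` reports the risky layout, and `check_vision_nesting(fixed)` reports the layout with vision kept on one hop.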

sendThrough: stability versus whiteness

The third symptom is subtler and relates to schemes where outbound traffic is bound to a specific IP (that same "the white entry doesn't reach out" technique from the triple cascade). Binding egress to the second address via sendThrough is a beautiful idea, but it has a price.

An observation from practice: the route from the second IP to the foreign node is sometimes unstable in itself, even when an ordinary TCP connect from it goes through. The result — flakiness: half the connections stall, half go, with no obvious pattern. TCP seems to connect, but traffic flows in bursts.

Here an honest fork arises, and you need to understand it in advance:

Priority — the exit IP's whiteness. You keep sendThrough on the second address, tolerating the possible route flakiness.
Priority: stability. You remove the binding, and egress goes from the entry IP. The connection becomes steady (observed: 16 out of 16 connections succeeded, versus about 50% with the binding). The cost: the entry does connect outward after all.

Importantly, even without sendThrough the outbound from the entry is wrapped in Reality and looks like ordinary TLS from the outside — that is, what you sacrifice is not the disguise but specifically that "the white IP is absolutely mute" property. Sometimes stability matters more, and this is a normal engineering compromise, not a defeat.
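To decide on numbers rather than impressions, run the same probe with and without the binding and compare success rates. The sketch below only does the tallying. The `probe` callable is an assumption and must move real data through the tunnel, since a bare TCP connect can succeed on a flaky route; for a raw socket test, Python's `socket.create_connection` accepts a `source_address` argument for binding to a specific local IP.

```python
from typing import Callable


def success_rate(probe: Callable[[], bool], attempts: int) -> float:
    """Fraction of attempts for which probe() returned True."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    successes = sum(1 for _ in range(attempts) if probe())
    return successes / attempts


def compare_bindings(bound_probe: Callable[[], bool],
                     unbound_probe: Callable[[], bool],
                     attempts: int = 16) -> dict:
    """Success rates with and without egress binding, side by side."""
    return {
        "bound": success_rate(bound_probe, attempts),
        "unbound": success_rate(unbound_probe, attempts),
    }
```

The result is just two fractions; which one wins depends on whether the exit IP's whiteness or stability is the priority.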

How to make a diagnosis

Put it all together and you get a decision tree that saves hours:

Split by site (Google yes, Cloudflare no), "opens — won't download" → first MTU (shrink the segment).
MTU fixed, but the download still stalls after a couple hundred bytes on a multi-hop → nested vision (leave vision on one hop, the rest on gRPC).
Flakiness with no tie to sites, ~50% of connections stall in a scheme with egress binding → sendThrough (weigh whiteness against stability).
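The same tree can be written as a function, which makes the order of checks explicit. This is a sketch: the parameter names and the returned labels are invented for illustration, and real symptoms are rarely this clean.

```python
def diagnose(split_by_site: bool = False,
             opens_but_wont_download: bool = False,
             mtu_fixed: bool = False,
             multi_hop: bool = False,
             egress_bound: bool = False,
             random_stalls: bool = False) -> str:
    """Map observed symptoms to the most likely cause, checked in order."""
    # 1. Site-dependent failures or stalled downloads: suspect MTU first.
    if (split_by_site or opens_but_wont_download) and not mtu_fixed:
        return "mtu: shrink the segment"
    # 2. MTU already fixed, downloads still stall on a multi-hop chain.
    if mtu_fixed and opens_but_wont_download and multi_hop:
        return "nested vision: keep vision on one hop, rest on gRPC"
    # 3. Stalls unrelated to sites in a scheme with egress binding.
    if random_stalls and egress_bound:
        return "sendThrough: weigh whiteness against stability"
    return "unknown: symptoms do not match a known signature"
```

For example, `diagnose(opens_but_wont_download=True, mtu_fixed=True, multi_hop=True)` points at nested vision rather than packet size.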

The general principle: a break in a cascade is almost never random. Randomness in a network looks like randomness; but "these sites yes, those no" or "stalls exactly on the download" are the signatures of specific physical causes. Learn to read the signatures and you stop guessing.

How these techniques look in configs (sockopt, MSS clamp, gRPC on relay hops, sendThrough) is covered in the practice guides "A triple cascade: entry → relay → exit" and "A two-node cascade + WARP on the exit."

