Background
I am often in locations thousands of miles away from the Server, to which I must unidirectionally stream data over UDP via a travel Router. Duplicate data is fine, but latency is not. Each sent packet must make it to the Server as quickly as possible, whatever the bandwidth cost. I’m happy to spend 10x the bandwidth if it wins me even just 3 ms on average.
To that end, I’m experimenting with duplicating packets on the same interface and changing the destination port of one copy from 443 to 8443. Since the destination is listening on both ports, the two packets race against each other.
Roughly, this:
                   +--------+
UDP packet    eth1 |        | phy0-sta0 (dport 443)
from client ----->-| Router |--->--------------------> Server
dport 443.         |        | phy0-sta0 (dupe, set dport 8443)
                   +--------+ ---------------------> Server
You can think of it as a homegrown approach to Peplink’s WAN smoothing or Speedify’s redundant mode, especially as in the future, I may want to dupe to the other antenna on the router, phy1-sta0, and/or also to eth0 [if wired into WAN directly], but we can keep it simple for now.
Attempted approach
There are a lot of related questions, e.g. 1. iptables / nftables: Forward UDP data to multiple targets, 2. Need to duplicate UDP packets to multiple destinations via iptables, 3. Setting up UDP packets to two different destinations using iptables and PREROUTING, 4. How do you duplicate all UDP traffic on a port range using nftables, 5. nftables: duplicate broadcast packets between segments, 6. nftables: duplicate UDP packets for specific destination IP:port to a (second) destination IP:port.
Ultimately, this answer by A.B (who’s been very helpful across many of these Qs) seemed like the best fit for my circumstances (OpenWRT 23.05.4, Kernel 5.15.162), so I adapted it like so:
nft -f - <<'EOF'
table netdev t_dup # for idempotency
delete table netdev t_dup # for idempotency
add table netdev t_dup {
    chain c_ingress {
        type filter hook ingress device "eth1" priority filter; policy accept;
        iif eth1 udp dport 443 meta mark != 1 meta mark set 1 dup to eth1 udp dport set 8443
    }
}
EOF
In other words, catch any packet entering on eth1 with dport 443. Mark it to prevent loops. Duplicate it back to eth1 and set the dport of the original packet to 8443.
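To check how often the rule actually fires, one option is a counted variant of the same ruleset. This is only a sketch: the `counter` statement and the `nft list chain` call are additions for inspection and are not part of the ruleset above, and it has not been verified on this exact OpenWRT build.

```shell
nft -f - <<'EOF'
table netdev t_dup         # for idempotency
delete table netdev t_dup  # for idempotency
add table netdev t_dup {
    chain c_ingress {
        type filter hook ingress device "eth1" priority filter; policy accept;
        # Same match and actions as above; "counter" only records packets/bytes.
        iif eth1 udp dport 443 meta mark != 1 counter meta mark set 1 dup to eth1 udp dport set 8443
    }
}
EOF

# After sending a test packet from the client, read the counter back.
nft list chain netdev t_dup c_ingress
```

The counter sits after the match expressions, so it only counts packets that are about to be marked, duplicated, and rewritten.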
The above actually worked just fine when I tested it with netcat listening on UDP 443 and 8443 on lo and sending a dport 443 payload down lo. Both the 443 and 8443 listeners got it, as intended. But the trouble begins as soon as we involve an interface that is meant to route the packet further.
Issues
The duped packet goes in reverse. Consider this case, where a client connected to the Router sends a single UDP packet to 1.1.1.1. (Note: eth1 is bridged via br-lan as there are multiple eth ports, which could be relevant.) Without duping/rewriting dport, things work just fine: eth1 -> br-lan -> phy0-sta0 -> internet. But with the dupe, the original packet, rewritten to dport 8443, goes out correctly, while the dupe tries to egress from eth1:
root@Router:~# tcpdump -n -i any '(udp and (port 443 or port 8443)) or (host 1.1.1.1)'
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
21:19:24.274700 eth1 In IP 192.168.3.189.59598 > 1.1.1.1.443: UDP, length 17
21:19:24.274726 eth1 Out IP 192.168.3.189.59598 > 1.1.1.1.443: UDP, length 17
21:19:24.274700 br-lan In IP 192.168.3.189.59598 > 1.1.1.1.8443: UDP, length 17
21:19:24.274842 phy0-sta0 Out IP 192.168.170.228.59598 > 1.1.1.1.8443: UDP, length 17
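To make the direction of each copy easier to read, the capture above can be boiled down to interface, direction, and destination port. A minimal awk sketch follows; the `summarize` helper and the `capture` variable are hypothetical names made up for this illustration, and the sample lines are copied from the dump above.

```shell
#!/bin/sh
# Sample lines copied from the `tcpdump -n -i any` capture above.
capture='21:19:24.274700 eth1 In IP 192.168.3.189.59598 > 1.1.1.1.443: UDP, length 17
21:19:24.274726 eth1 Out IP 192.168.3.189.59598 > 1.1.1.1.443: UDP, length 17
21:19:24.274700 br-lan In IP 192.168.3.189.59598 > 1.1.1.1.8443: UDP, length 17
21:19:24.274842 phy0-sta0 Out IP 192.168.170.228.59598 > 1.1.1.1.8443: UDP, length 17'

# summarize (hypothetical helper): for every line on stdin, print
# "<interface> <direction> dport=<port>".
summarize() {
    awk '{
        dst = ""
        # The destination "addr.port:" is the field right after ">".
        for (i = 1; i < NF; i++) if ($i == ">") { dst = $(i + 1); break }
        if (dst == "") next
        sub(/:$/, "", dst)              # drop the trailing colon
        n = split(dst, parts, "[.]")    # last dot-separated piece is the port
        printf "%s %s dport=%s\n", $2, $3, parts[n]
    }'
}

summary=$(printf '%s\n' "$capture" | summarize)
printf '%s\n' "$summary"
```

On this capture the only `Out` line with dport 443 is on eth1, while phy0-sta0 carries only the dport 8443 copy.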
I thought the issue might be with the source/dest MAC addresses. Perhaps they get flipped, and the packet is sent the reverse way? But dumping across all 3 interfaces, I do not see anything wrong with the MACs (possibly tcpdump isn’t recording these right?). I’ve search-and-replaced the addresses with the matching labels, and as you can see, on eth1, both the original and the dupe seem to have the correct direction of source/dest MAC addr:
root@Router:~# tcpdump -e -n -i eth1 '(udp and (port 443 or port 8443)) or (host 1.1.1.1)' -vvv
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
21:22:30.239244 CLIENT_MAC_ADDR > ETH1_AND_BR_LAN_MAC_ADDR, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 3341, offset 0, flags [none], proto UDP (17), length 45)
192.168.3.189.50858 > 1.1.1.1.443: [udp sum ok] UDP, length 17
21:22:30.239270 CLIENT_MAC_ADDR > ETH1_AND_BR_LAN_MAC_ADDR, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 3341, offset 0, flags [none], proto UDP (17), length 45)
192.168.3.189.50858 > 1.1.1.1.443: [udp sum ok] UDP, length 17
root@Router:~# tcpdump -e -n -i br-lan '(udp and (port 443 or port 8443)) or (host 1.1.1.1)' -vvv
tcpdump: listening on br-lan, link-type EN10MB (Ethernet), snapshot length 262144 bytes
21:22:30.239244 CLIENT_MAC_ADDR > ETH1_AND_BR_LAN_MAC_ADDR, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 3341, offset 0, flags [none], proto UDP (17), length 45)
192.168.3.189.50858 > 1.1.1.1.8443: [udp sum ok] UDP, length 17
root@Router:~# tcpdump -e -n -i phy0-sta0 '(udp and (port 443 or port 8443)) or (host 1.1.1.1)' -vvv
tcpdump: listening on phy0-sta0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
21:22:30.239386 PHY0_STA0_MAC_ADDR > GATEWAY_MAC_ADDR, ethertype IPv4 (0x0800), length 59: (tos 0x0, ttl 63, id 3341, offset 0, flags [none], proto UDP (17), length 45)
192.168.170.228.50858 > 1.1.1.1.8443: [udp sum ok] UDP, length 17
What happens if we dupe on br-lan? I’ll spare you the three-interface dump as the picture is the same; here’s the dump on any:
root@Router:~# tcpdump -e -n -i any '(udp and (port 443 or port 8443)) or (host 1.1.1.1)'
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
21:25:58.479889 eth1 In ifindex 3 CLIENT_MAC_ADDR ethertype IPv4 (0x0800), length 66: 192.168.3.189.53620 > 1.1.1.1.443: UDP, length 17
21:25:58.479889 br-lan In ifindex 7 CLIENT_MAC_ADDR ethertype IPv4 (0x0800), length 66: 192.168.3.189.53620 > 1.1.1.1.443: UDP, length 17
21:25:58.479925 br-lan Out ifindex 7 CLIENT_MAC_ADDR ethertype IPv4 (0x0800), length 66: 192.168.3.189.53620 > 1.1.1.1.443: UDP, length 17
21:25:58.479935 eth1 Out ifindex 3 CLIENT_MAC_ADDR ethertype IPv4 (0x0800), length 66: 192.168.3.189.53620 > 1.1.1.1.443: UDP, length 17
21:25:58.480033 phy0-sta0 Out ifindex 8 PHY0_STA0_MAC_ADDR ethertype IPv4 (0x0800), length 65: 192.168.170.228.53620 > 1.1.1.1.8443: UDP, length 17
So, the duping injected the packet into br-lan, but again as outgoing, and then it egressed all the way out to eth1 and attempted to go out to the internet, failing. The original, with dport rewritten to 8443, went out successfully.
I am very likely missing something basic. What could it be? What determines whether the interface treats a packet as ingressing or egressing? You’d think dupe-ing on ingress would result in an ingress packet.
More context
If it’s helpful:
...
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br-lan state UP group default qlen 1000
    link/ether ETH1_AND_BR_LAN_MAC_ADDR brd ff:ff:ff:ff:ff:ff
7: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ETH1_AND_BR_LAN_MAC_ADDR brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.1/24 brd 192.168.3.255 scope global br-lan
       valid_lft forever preferred_lft forever
    inet6 fd8d:de60:8b97::1/60 scope global noprefixroute
       valid_lft forever preferred_lft forever
    inet6 fe80::9683:c4ff:fe4c:8c30/64 scope link
       valid_lft forever preferred_lft forever
8: phy0-sta0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether PHY0_STA0_MAC_ADDR brd ff:ff:ff:ff:ff:ff
    inet 192.168.170.228/24 brd 192.168.170.255 scope global phy0-sta0
       valid_lft forever preferred_lft forever
    inet6 fe80::9683:c4ff:fe4c:8c31/64 scope link
       valid_lft forever preferred_lft forever
root@Router:~# ip route show
default via 192.168.170.139 dev phy0-sta0 proto static src 192.168.170.228
192.168.3.0/24 dev br-lan proto kernel scope link src 192.168.3.1
192.168.170.0/24 dev phy0-sta0 proto kernel scope link src 192.168.170.228