Honestly, this is one of those areas where the gap between “technically possible” and “practically doable in a sustainable, secure way” is a lot wider than it seems on the surface.
You’re absolutely right that the WiFi calling stack—especially the ePDG and IKE components—does some end-running around userspace VPNs and Private DNS. That’s by design on Google’s part, since WiFi calling is meant to be a carrier-grade service that needs to establish a tunnel before the OS has even fully decided which network is “primary.” From AOSP’s perspective, it’s treated almost like a separate radio interface.
The challenge with trying to force that traffic through a userspace VPN isn’t just about finding the right hooks in IwlanDataService or EpdgTunnelManager. It’s that the WiFi calling daemons and related components run with elevated privileges and expect to bind to specific physical interfaces. Once you start redirecting that traffic, you risk breaking things like:
Emergency calling: this is rightly treated as a critical path that the OS will fight to keep working, even if it means bypassing your VPN.
Race conditions at boot: the ePDG tunnel often tries to establish before VPN apps are even running, especially if the SIM is locked or the device is freshly booted.
Double NAT and MTU issues: IPsec over WireGuard over mobile data can get fragile fast, and when it fails, users blame GrapheneOS for “broken calling,” not the underlying complexity.
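To put rough numbers on the MTU point, here is a back-of-the-envelope sketch. The overhead constants are my own assumptions (WireGuard data-message framing, plus ESP in UDP for NAT traversal with AES-GCM, ignoring ESP alignment padding), and inner_mtu is just a hypothetical helper, not anything from AOSP or GrapheneOS. Real values depend on the carrier's cipher suite and whether the outer hops are IPv4 or IPv6.

```python
# Rough per-packet overheads in bytes. Illustrative assumptions only,
# not measurements from any specific carrier, device, or VPN app.
WIREGUARD_OVER_IPV4 = 20 + 8 + 32  # outer IPv4 + UDP + WireGuard header and auth tag
WIREGUARD_OVER_IPV6 = 40 + 8 + 32  # same, with an IPv6 outer header
# ESP in UDP (NAT traversal), tunnel mode, AES-GCM assumed, alignment padding ignored:
# outer IPv4 + UDP + ESP header (SPI, sequence) + IV + pad-length/next-header + ICV
ESP_UDP_OVER_IPV4 = 20 + 8 + 8 + 8 + 2 + 16


def inner_mtu(link_mtu: int, *overheads: int) -> int:
    """Largest inner packet that still fits after each encapsulation layer."""
    remaining = link_mtu - sum(overheads)
    if remaining <= 0:
        raise ValueError("encapsulation overhead exceeds link MTU")
    return remaining


if __name__ == "__main__":
    link = 1500  # typical Ethernet/WiFi MTU; mobile links are often lower
    print("WireGuard only (IPv4):", inner_mtu(link, WIREGUARD_OVER_IPV4))
    print("IPsec inside WireGuard (IPv4):",
          inner_mtu(link, WIREGUARD_OVER_IPV4, ESP_UDP_OVER_IPV4))
    print("IPsec inside WireGuard (IPv6 outer):",
          inner_mtu(link, WIREGUARD_OVER_IPV6, ESP_UDP_OVER_IPV4))
```

Under those assumptions, stacking both layers eats well over 100 bytes per packet, so anything on the path with a smaller MTU or broken path MTU discovery can turn into fragmentation or silently dropped packets.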
The other piece is maintenance. Even if someone patches this today, every Android release, and especially the quarterly platform releases (QPRs), touches the connectivity stack in ways that aren’t always well documented. A patch that works in one version can silently break in the next, and because this touches low-level networking, the failure modes can be subtle: calls silently failing to ring, or the phone falling back to unencrypted signaling without the user knowing.
That’s not to say it shouldn’t be done. I think a lot of us in the privacy community would love to see WiFi calling traffic forced through the VPN. But the GrapheneOS team tends to avoid changes that add a significant long-term maintenance burden unless there’s a clear path to keeping them stable across updates. They’ve also historically been cautious about anything that could impact call reliability, since for many users that’s a non-negotiable part of daily phone use.
If you’re serious about experimenting with this, one approach might be to prototype it as a separate mod or even a userspace daemon that runs alongside the OS, rather than trying to merge it into GrapheneOS directly. That way you could prove out the stability and edge-case handling without asking the team to commit to maintaining it long-term.
It’s a frustrating limitation, no doubt, but I think the hesitation isn’t because it’s impossible—it’s because doing it right in a way that doesn’t cause regressions is genuinely hard.