Here is the rewritten single-header library that wraps most (all?) MTE functionality, mentioned earlier by @flawedworld.
I'll quickly address a few points raised in this thread:
- Intrinsics
When I started working on it, I don't think intrinsics were readily available, or at least I wasn't able to find them. For this reason I used raw ASM instead (which causes other issues down the line but I digress).
I fully agree that intrinsics should be used if possible, and will take a look at them more closely.
However, some optimizations I do in MTELib might not be possible without assembly (e.g. this), though I haven't done any profiling, so the gain may be insignificant and this point moot.
- The do/while in pointer tagging functions
This comes from a misunderstanding on my side of how random tag generation works: I thought that all bits set in the exclude mask were excluded from the randomly generated tag, when in fact bit X of the exclude mask prevents tag X from being generated. This issue is addressed in MTELib.
- DC GVA/DC GZVA
I wasn't aware those existed as they're not mentioned in the MTE whitepaper.
From a quick glance at the AArch64 instruction documentation, it looks like they might generate exceptions (though this seems related to the hypervisor, so maybe they are just hypervisor traps?). I will look into this more closely later.
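
To make the exclude mask point above concrete, here is a small portable sketch of the semantics as I now understand them: the mask has 16 bits, one per possible 4-bit tag, and a set bit X rules out tag X. This is only a software model for illustration, not MTELib code and not the hardware's actual selection algorithm; `tag_allowed` and `next_allowed_tag` are hypothetical names. On real hardware the mask would be passed to `IRG` (or to the ACLE intrinsic `__arm_mte_create_random_tag` from `<arm_acle.h>`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the 4-bit tag `tag` may be generated
 * under `exclude_mask`. Bit X of the mask excludes tag X, so the mask is
 * indexed by tag value; it is not a bit pattern applied to the tag. */
static inline int tag_allowed(uint16_t exclude_mask, unsigned tag)
{
    assert(tag < 16); /* MTE logical tags are 4 bits wide */
    return ((exclude_mask >> tag) & 1u) == 0;
}

/* Hypothetical model of tag selection: returns the first allowed tag at or
 * after `start`, wrapping around. A real implementation picks randomly;
 * this deterministic walk only demonstrates which tags are eligible.
 * Assumption: when every tag is excluded, tag 0 is returned. */
static inline unsigned next_allowed_tag(uint16_t exclude_mask, unsigned start)
{
    for (unsigned i = 0; i < 16; i++) {
        unsigned tag = (start + i) & 0xFu;
        if (tag_allowed(exclude_mask, tag))
            return tag;
    }
    return 0;
}
```

For example, a mask of `(1u << 0) | (1u << 5)` rules out exactly tags 0 and 5 and leaves the other 14 tags eligible. Since an excluded tag can never come back, a retry loop around the tag generation is unnecessary.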
I will try to integrate this library into hardened_malloc, in a similar fashion to what I currently have on GitHub, in the next few days.