SBCL now uses the "write-bitmap"/"card marking" scheme on some platforms, which is a tiny bit faster (1-2% was mentioned on sbcl-devel). More interesting is that doing the touch detection in software allows finer-grained precision (e.g. #+mark-region-gc uses 128-byte cards) than hardware does (e.g. 4 KiB pages on x86-64, 16 KiB on M1), which can drastically affect scavenging time [0]. The precision is also really nice for non-moving generational schemes: if old and new objects exist on the same card, writes to the new objects (which are more common, too!) will cause the old objects to be scanned needlessly by the GC. Demers et al. call this "card pollution" [1], so reducing the card size reduces the likelihood of it happening.
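To make the card-size point concrete, here is a toy model in Python. It is only a sketch under made-up assumptions (a flat address space, 64-byte objects packed back to back, hypothetical helper names) and says nothing about how SBCL actually lays out its heap. It counts how many old objects end up on the dirtied card after a single store into a new object.

```python
# Hypothetical model: the heap is a flat address range and a "card" is a
# fixed-size slice of it. Names and layout are illustrative, not SBCL's.

def card_index(addr, card_size):
    """Index of the card containing the byte at addr."""
    return addr // card_size

def polluted_old_objects(objects, written_addr, card_size):
    """Start addresses of old objects that share a card with written_addr.

    objects is a list of (start, size, is_old) tuples. An object shares the
    card if any of its bytes fall on it.
    """
    dirty = card_index(written_addr, card_size)
    hits = []
    for start, size, is_old in objects:
        first = card_index(start, card_size)
        last = card_index(start + size - 1, card_size)
        if is_old and first <= dirty <= last:
            hits.append(start)
    return hits

# Sixteen 64-byte objects packed from address 0; only the last one is new.
heap = [(i * 64, 64, i != 15) for i in range(16)]
write_to = 15 * 64  # a store into the new object

small = polluted_old_objects(heap, write_to, 128)   # 128-byte cards
large = polluted_old_objects(heap, write_to, 4096)  # 4 KiB pages

print(len(small), len(large))
```

In this toy layout the 128-byte card drags in one old neighbour, while the 4 KiB page drags in all fifteen old objects.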
Oh neat, I thought it was still unoptimized compared to gencgc (which afaik does still use write protection). I'll have to check it out and see if I have a machine it'll run faster on.
gencgc uses software protection on some but not all architectures -- I recall x86-64 and MIPS but not ARM though. On x86-64 with SBCL 2.3.8 for example:
https://medium.com/@MartinCracauer/generational-garbage-coll...
It's handy for knowing which areas of the heap have been untouched (and thus don't need to be scavenged in a generational gc).
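Roughly, the bookkeeping looks like the sketch below. It is a simplified Python illustration with hypothetical names (`CardTable`, `write_barrier`, `cards_to_scavenge`), assuming one dirty byte per 128-byte card; it is not SBCL's actual barrier, which is emitted by the compiler as machine code.

```python
CARD_SIZE = 128  # bytes per card; an assumed value for this sketch

class CardTable:
    """One dirty flag per card of a heap that is heap_bytes long."""

    def __init__(self, heap_bytes):
        n_cards = (heap_bytes + CARD_SIZE - 1) // CARD_SIZE
        self.dirty = bytearray(n_cards)

    def write_barrier(self, addr):
        # Run on every pointer store: mark the card that was written to.
        self.dirty[addr // CARD_SIZE] = 1

    def cards_to_scavenge(self):
        # Only touched cards can hold new old-to-young pointers.
        return [i for i, flag in enumerate(self.dirty) if flag]

    def clear(self):
        # Reset all flags after a collection.
        for i in range(len(self.dirty)):
            self.dirty[i] = 0

table = CardTable(4096)
for addr in (130, 200, 4000):  # three stores, two of them on the same card
    table.write_barrier(addr)

to_scan = table.cards_to_scavenge()
skipped = len(table.dirty) - len(to_scan)
print(to_scan, skipped)
```

Here the collector would look at 2 cards and skip the other 30, which is the whole point of tracking untouched areas.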