Thanks, this looks quite promising! As noted in other threads, a version covering the plain NEON instructions (and the intermediate extensions: rdm, dotprod, i8mm, etc.) would be good, as would one covering the SVE2 instructions.
Edit: Oh, I see this does have SVE2 as well - that's great!
The ARM ARM is quite heavy to browse; for baseline NEON, I've used the "ARMv8 Instruction Set Overview" [1], which comes in at a neat 115 pages and is great for browsing and finding what's available. But for anything newer than the baseline, one pretty much has to refer to the ARM ARM right now.
I hope it's useful, though I think most people care about the "Advanced SIMD" (Neon) instructions, which I'd also like to do. I started with SVE because I wasn't already familiar with it, so it was a more interesting project.
(For anyone unfamiliar, SVE is supported only on extremely recent ARM CPUs, and Apple CPUs do not yet support it, whereas AdvSIMD is available on all ARMv8-A CPUs.)
Heh, I'd have called it the Arm SIMD Instruction List, but Arm have been aggressively enforcing the Arm trademark [1], so I settled for A64 (the official name for the instruction set [2]).
Scalable because this one can work in a vector-length-agnostic way. That is, you can write code that works with 128-bit vectors and also runs on a machine with 512-bit vectors without recompilation.
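To make the vector-length-agnostic idea concrete, here is a rough sketch in plain portable C that only emulates the pattern; it does not use real SVE intrinsics. The function name `vla_add` and the `lanes` parameter are made up for illustration: `lanes` stands in for the hardware vector length in 32-bit elements (which real SVE code would query at run time, e.g. with CNTW), and the inner bounds check stands in for a WHILELT-style predicate that switches off lanes past the end of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulated vector-length-agnostic add: dst[i] = a[i] + b[i] for i < n.
 * `lanes` is a hypothetical stand-in for the number of 32-bit elements
 * per hardware vector (4 for 128-bit vectors, 16 for 512-bit vectors).
 * Nothing in the loop hard-codes that width. */
void vla_add(int32_t *dst, const int32_t *a, const int32_t *b,
             size_t n, size_t lanes)
{
    assert(lanes > 0);

    /* Outer loop: one iteration per "vector" of `lanes` elements. */
    for (size_t i = 0; i < n; i += lanes) {
        /* Inner loop: emulates the lanes of one predicated vector op.
         * Lane k is active only if i + k < n, so the final partial
         * vector needs no separate scalar tail loop. */
        for (size_t k = 0; k < lanes; k++) {
            if (i + k < n) {
                dst[i + k] = a[i + k] + b[i + k];
            }
        }
    }
}
```

Calling `vla_add` with `lanes = 4` and then with `lanes = 16` on the same arrays gives the same result, which is the point: the same code adapts to whatever vector length it finds. On real hardware the inner loop would be a single predicated instruction rather than a scalar loop.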
[1] It used to be available from ARM themselves, but I can't seem to find it these days. I've found one copy online at https://www.cs.princeton.edu/courses/archive/spr19/cos217/re... though.