I thought everything would be ported by hand or written in high-level languages that would synthesize the SIMD code, with the latter making ports to MIPS64 or RISC-V easier later. Nah, y'all took a much better route. Amazing.
I don’t know if it’s better but it’s certainly more convenient. I’m not sure I like taking this to its logical conclusion (i.e., every other architecture’s vector unit operations expressed in terms of Intel SIMD and those essentially becoming the standard operations), especially since I know when rewriting the SSE2 VP9 decoder there were quite a few improvements I could make and more efficient AltiVec operations I could use. But it took me weeks to do that work. I “vectorized” KANN in about a minute. That’s pretty hard to argue with.
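For anyone who hasn't looked inside one of these headers, here is a minimal sketch of the idea: calling code is written once against an Intel-shaped operation, and the header picks a native mapping per target. The `x_` names and the `v128` type are made up for illustration. Real translation headers typically reuse the Intel intrinsic names directly, which is left out here so the sketch also compiles on x86. The scalar fallback is an assumption added to keep it buildable anywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i v128;                 /* x86: pass straight through */
static inline v128 x_add_epi32(v128 a, v128 b) { return _mm_add_epi32(a, b); }
#elif defined(__ARM_NEON)
#include <arm_neon.h>
typedef int32x4_t v128;               /* Arm: one NEON op per Intel op */
static inline v128 x_add_epi32(v128 a, v128 b) { return vaddq_s32(a, b); }
#else
typedef struct { int32_t lane[4]; } v128;   /* hypothetical scalar fallback */
static inline v128 x_add_epi32(v128 a, v128 b) {
    v128 r;
    for (int i = 0; i < 4; i++)       /* wraparound add, like the SIMD op */
        r.lane[i] = (int32_t)((uint32_t)a.lane[i] + (uint32_t)b.lane[i]);
    return r;
}
#endif

/* Unaligned load/store via memcpy, valid for all three representations. */
static inline v128 x_loadu(const int32_t *p) { v128 v; memcpy(&v, p, sizeof v); return v; }
static inline void x_storeu(int32_t *p, v128 v) { memcpy(p, &v, sizeof v); }

/* Caller code: written once, four lanes at a time, scalar tail. */
static void add_arrays(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        x_storeu(dst + i, x_add_epi32(x_loadu(a + i), x_loadu(b + i)));
    for (; i < n; i++)
        dst[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
}
```

A simple add maps one-to-one, which is why the "about a minute" port works. The cost shows up for operations with no direct counterpart, where the header has to emit several instructions and a hand port could have picked something better.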
I see what you’re saying. I can’t find my SIMD languages in a quick search. Prolly lost the bookmarks. Did find Sierra, which lets you express it in C++. Might do another search for SIMD languages or whatever in the near future.
packed_simd, simdeez, ispc… they don’t quite have the word “language” attached to them, but they are portable SIMD abstractions.
(fun story about how ISPC ended up with a NEON backend)
Thanks. The story was another good reminder of why companies had better revoke access after employees leave.
So the POWER equivalent of sse2neon is literally part of GCC?? Uh... interesting.
I wonder how far these reimplementations of one SIMD ISA in terms of another can go with the more modern stuff. Everyone does the more complex table/shuffle/etc ops quite differently…
It is possible to go a lot further than this.
Revec is an LLVM optimization pass that optimizes vector intrinsics. It can rewrite SSE2 intrinsics to use AVX2 or AVX-512.
Automatic Algorithm Recognition and Replacement discusses a system for identifying and optimizing algorithms. The work is not theoretical - the Convex Application Compiler shipped in 1992.
With a compiler pass, of course anything is possible :)
I was asking more about the “just a C header” approach.
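To make the shuffle question concrete, here is a scalar sketch of one case where a header can still cope: the SSSE3 byte shuffle (`_mm_shuffle_epi8`) versus the AArch64 single-register table lookup (`vqtbl1q_u8`). The two disagree on which indices produce zero, and masking the index with `0x8F` reconciles them. As far as I know this is roughly the trick sse2neon uses on AArch64, but treat that as an assumption. The function names below are invented, and these are plain C models of the instruction semantics, not intrinsics.

```c
#include <stdint.h>

/* Model of pshufb: bit 7 of the index zeroes the lane,
   otherwise only the low 4 bits select a table byte. */
static void model_pshufb(uint8_t out[16], const uint8_t tbl[16],
                         const uint8_t idx[16]) {
    for (int i = 0; i < 16; i++)
        out[i] = (idx[i] & 0x80) ? 0 : tbl[idx[i] & 0x0F];
}

/* Model of AArch64 TBL with one table register:
   any index of 16 or more yields zero. */
static void model_tbl(uint8_t out[16], const uint8_t tbl[16],
                      const uint8_t idx[16]) {
    for (int i = 0; i < 16; i++)
        out[i] = (idx[i] < 16) ? tbl[idx[i]] : 0;
}

/* pshufb expressed through TBL: masking with 0x8F drops bits 4-6
   (which pshufb ignores) but keeps bit 7, so a set bit 7 still
   pushes the index out of range and TBL returns zero. */
static void pshufb_via_tbl(uint8_t out[16], const uint8_t tbl[16],
                           const uint8_t idx[16]) {
    uint8_t masked[16];
    for (int i = 0; i < 16; i++)
        masked[i] = idx[i] & 0x8F;
    model_tbl(out, tbl, masked);
}
```

That one costs a single extra AND. The harder cases are the ones where the source ISA has an operation with no cheap equivalent at all, and there the header approach can only fall back to a multi-instruction sequence.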