Message from @M4Gunner
Discord ID: 429489320915763218
the line correspondence is trippy here
This is x86-64, so it's using the SSE registers (the XMM ones) for the floating point.
The calling convention allows the arguments to come in via registers instead of the stack.
xmm0, xmm1, xmm2
The first 3 lines multiply each register by itself, which squares the value.
Then 2 lines to add them.
Then it does a `pxor` of a register with itself, which always generates a zero. It's the fastest way to load a zero into a register on Intel/AMD processors.
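For reference, a function along these lines would give the kind of output being walked through here. The original source isn't shown in this log, so the name `vec_length` and its signature are guesses, not the actual code.

```cpp
#include <cmath>

// Hypothetical reconstruction: name and signature are assumptions.
float vec_length(float x, float y, float z) {
    // Three multiplies square each argument, two adds sum the squares.
    float sum = x * x + y * y + z * z;
    // The square root is where the zeroing and compare sequence shows up.
    return std::sqrt(sum);
}
```

With three `float` arguments, the values arrive in xmm0, xmm1 and xmm2, so there is nothing to load before the multiplies.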
okay so in the beginning he doesnt manually move your variables into the first 3 registers
I'm not too familiar with amd64/x86-64 ABI, I'm guessing it can assume the first few arguments are both on the stack and on the XMM registers.
ok
had to look this part up
https://c9x.me/x86/html/file_module_x86_id_180.html
Most of the other commands are somewhat familiar
it's using single-precision floating point, so if you typed in different numbers it may have automatically chosen a format with more precision?
That's governed by how arithmetic on basic types works in C++.
Replace the `float` by `double`, and call `::sqrt()` to see it use different instructions.
The `pxor`, `ucomiss` and `ja` serve to check if the argument is non-negative; if so, it can just use the `sqrt` instruction; otherwise, it needs to call the `sqrt()` function from the standard library, which handles all the nasty NaN, Infinity, negative arguments.
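Written back out as C++, that check is roughly the branch below. This is only a sketch of the control flow: `checked_sqrt` is a made-up name, and both paths call `std::sqrt` here so it stays portable, whereas in the compiled output the fast path is a single instruction.

```cpp
#include <cmath>

// Hypothetical helper mirroring the compare-and-branch described above.
float checked_sqrt(float x) {
    const float zero = 0.0f;   // the register zeroed with pxor
    if (zero > x) {            // compare, then branch only when x is below zero
        // Slow path: the library routine deals with the error case.
        return std::sqrt(x);
    }
    // Fast path: in the generated code this is one hardware instruction.
    return std::sqrt(x);
}
```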
okay. so the chip has its own primitive math
Yeah, it's a CISC architecture.
Switch to the MIPS gcc, and you'll get a very different result.
ive only read a couple pages about MIPS so far
Reduced Instruction Set Computer, the whole point is to have so few instructions in the architecture that the circuitry is very small.
Being small means there's less need for synchronization, thus it can run faster, and there are more transistors that can be used for caches.
PowerISA stands for Performance Optimization With Enhanced Reduced Instruction Set Computer Instruction Set Architecture
So a typical RISC arch won't have any advanced instructions. No specialized math, no instructions that mix register operands with memory, etc.
they dont use the same design pillars when picking acronyms
Of course, at some point, you end up with extra room in the silicon, so some complex instructions sneak back in, just because they can.
so all risc programmers need to have all their maths in standard libraries?
It's not like there's a circuit that does math functions in Intel chips. It also runs some software to calculate it.
It's just that it's built into the chip.
Downside is, if the manufacturer didn't pay much attention to details, you can get bad results. Fast, but wrong.
what do you call that? firmware? embedded process??
That would be the CPU's microcode.
ok
Intel CPUs were notorious for having bad trig instructions, when outside the normalized range.
"Bad" meaning they didn't calculate all the bits they promised.
I remember seeing a paper a while back, about how those math functions in CPUs had some unexpected precision problems that didn't even match what the manual promised.
IIRC, for sine/cosine, Intel uses a lookup table, then interpolates the values.
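To make the "lookup table, then interpolate" idea concrete, here's a minimal sketch. It is not Intel's actual implementation: the table size, the linear interpolation, and the names `kTableSize`, `sine_table` and `table_sin` are all assumptions picked for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

constexpr std::size_t kTableSize = 256;            // assumed resolution
constexpr double kTwoPi = 6.283185307179586;       // approximation of 2*pi

// Builds the table once: kTableSize + 1 samples of sin over [0, 2*pi].
inline const std::array<double, kTableSize + 1>& sine_table() {
    static const std::array<double, kTableSize + 1> table = [] {
        std::array<double, kTableSize + 1> t{};
        for (std::size_t i = 0; i <= kTableSize; ++i) {
            t[i] = std::sin(kTwoPi * static_cast<double>(i) / kTableSize);
        }
        return t;
    }();
    return table;
}

double table_sin(double x) {
    // Range reduction: fold the argument into [0, 2*pi).
    double r = std::fmod(x, kTwoPi);
    if (r < 0.0) r += kTwoPi;

    // Locate the two neighbouring table entries.
    double pos = r / kTwoPi * kTableSize;
    std::size_t i = static_cast<std::size_t>(pos);
    if (i >= kTableSize) i = kTableSize - 1;       // guard against rounding up
    double frac = pos - static_cast<double>(i);

    // Linear interpolation between the neighbours.
    const auto& t = sine_table();
    return t[i] + frac * (t[i + 1] - t[i]);
}
```

The range-reduction step is also where accuracy for large inputs tends to go missing, since `kTwoPi` is only an approximation of 2π and the error grows with the size of the argument.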
wild guess, did this come to light during the early 3D era?

