stwu rS, d(rA)
-->
[rA+d] = rS
rA = rA-4
(I may be mixing up the order; it could also lower rA before storing. brkirch will probably know ^^)
The register is changed after the value has been stored, so that much is right, but it does not do rA = rA-4. Instead it does rA = rA+d, i.e. rA ends up holding the address the value was stored to.
Basically, stwu is normally used with rA = r1. r1 is the stack pointer, so this lets you store registers on the stack!
To elaborate on that, the idea is that stwu allows the stack pointer to be stored AND changed by the same instruction. If this were not possible, there would be a window between storing the stack pointer and changing it in which the stack is in an inconsistent state; an interrupt handler that tried to use the stack during that window could make bad things happen (crash, memory corruption, etc.).
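For anyone who finds it easier to read as code, here is a rough C model of the corrected behaviour described above (store to rA+d, then rA = rA+d). It is only a sketch: the Cpu struct, the 256-byte memory, and the helper names store_word, load_word and push_frame are made up for illustration and are not taken from any real emulator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 256u  /* tiny illustrative memory, not a real address space */

/* Hypothetical CPU model: 32 general-purpose registers plus memory. */
typedef struct {
    uint32_t gpr[32];
    uint8_t  mem[MEM_SIZE];
} Cpu;

/* Store a 32-bit word in big-endian byte order, as PowerPC does. */
static void store_word(Cpu *cpu, uint32_t ea, uint32_t value) {
    assert(ea <= MEM_SIZE - 4u);
    cpu->mem[ea]     = (uint8_t)(value >> 24);
    cpu->mem[ea + 1] = (uint8_t)(value >> 16);
    cpu->mem[ea + 2] = (uint8_t)(value >> 8);
    cpu->mem[ea + 3] = (uint8_t)value;
}

/* Load a 32-bit big-endian word. */
static uint32_t load_word(const Cpu *cpu, uint32_t ea) {
    assert(ea <= MEM_SIZE - 4u);
    return ((uint32_t)cpu->mem[ea] << 24) | ((uint32_t)cpu->mem[ea + 1] << 16) |
           ((uint32_t)cpu->mem[ea + 2] << 8) | (uint32_t)cpu->mem[ea + 3];
}

/* stwu rS, d(rA):  EA = rA + d;  [EA] = rS;  rA = EA.
   rS is read before rA is updated, so "stwu r1, -16(r1)" stores the OLD r1. */
static void stwu(Cpu *cpu, int rs, int16_t d, int ra) {
    uint32_t ea = cpu->gpr[ra] + (uint32_t)(int32_t)d;  /* d is sign-extended */
    store_word(cpu, ea, cpu->gpr[rs]);
    cpu->gpr[ra] = ea;
}

/* Model of "stwu r1, -size(r1)": push a frame and return the new r1. */
static uint32_t push_frame(Cpu *cpu, int16_t size) {
    stwu(cpu, 1, (int16_t)-size, 1);
    return cpu->gpr[1];
}
```

With r1 = 0x80, push_frame(&cpu, 16) leaves r1 = 0x70 and the word at 0x70 holds the old stack pointer 0x80. Both effects happen inside the single stwu call, which is the point about interrupts made above: there is no step in between where r1 has moved but the old value has not been saved yet.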