.long instruction as string with C2/C0

Started by Deathwolf, February 27, 2011, 06:47:17 PM


smoere

Would you care to elaborate on why you need to do this from within the game code?

Deathwolf

Ok if you have the codetype 48, why are you using ASM? 48 works fine too!
If you have codetype 04, why are you using 00,02,04,06? Because it's a smaller code? lol

I don't see the real reason to do this.
lolz

Y.S.

Quote
String:
# r0 as the base register of stw reads as literal 0, so r4 holds the address here
lis r4,0x8012
ori r4,r4,0x2340
lis r3,0xFFFF
ori r3,r3,0xFFFF
stw r3,0(r4)

lis r5,0x8012
ori r5,r5,0x2344
lis r9,0x4200
ori r9,r9,0x6666
stw r9,0(r5)

lis r10,0x8012
ori r10,r10,0x2348
lis r11,0x16FD
ori r11,r11,0xCCF7
stw r11,0(r10)

lis r12,0x8012
ori r12,r12,0x234C
lis r14,0x3F80
ori r14,r14,0x0020
stw r14,0(r12)

Writing a string of data using ASM can be done like this:
[spoiler].set base,0x80122340

.set data,3
.set address,4
.set offset,5

.set counter,12
.set anchor,12

stwu   r1,-0x80(r1)
stmw   r2,8(r1)
mflr   r2

li   counter,(_DataEnd - _DataStart)/4
mtctr   counter

bl   _DataEnd
_DataStart:
.long 0xFFFFFFFF
.long 0x42006666
.long 0x16FDCCF7
.long 0x3F800020
_DataEnd:

mflr   anchor
lis   address,base@h
ori   address,address,base@l
li   offset,0

_Loop:
lwzx   data,anchor,offset
stwx   data,address,offset
addi   offset,offset,4
bdnz+   _Loop

mtlr   r2
lmw   r2,8(r1)
addi   r1,r1,0x80[/spoiler]

dcx2

Y.S.

I don't think you can use r2 to cache the LR.  r2 is a reserved register that points to a read-only data area for things like constants and globals.

Also, you can replace the lwzx/stwx with lwzu/stwu.  This would allow you to optimize away the li offset and addi offset, but you need to un-offset the anchor by 4 so the first lwzu points to the right place.  Fortunately, we can un-offset the base during compilation, so only the anchor needs an extra instruction.  The loop will also execute slightly faster.

This requires your data to be word aligned.  If you want byte alignment, you'd have to rewrite it with lbzu/stbu instead.  If you know your data is double-word aligned, you can switch to psq_lu and psq_stu for a somewhat pointless increase in speed.
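
If the "un-offset" part is hard to picture, here is a toy model of the update-form loop in Python. It is only a sketch: memory is a plain dict, SRC is a made-up address standing in for _DataStart, and the function name is my own. The point is just that lwzu/stwu add the displacement before accessing memory and write the new address back, so both pointers have to start 4 bytes early.

```python
# Toy model of the lwzu/stwu copy loop. Memory is a dict of address -> word.
# SRC and copy_with_update are hypothetical names used for illustration only.

def copy_with_update(memory, src_start, dst_start, count):
    """Copy `count` words using pre-increment (update-form) addressing."""
    anchor = src_start - 4     # mirrors: subi anchor,anchor,4
    address = dst_start - 4    # mirrors: .set base,0x80122340 - 4
    for _ in range(count):     # mirrors: mtctr / bdnz
        anchor += 4            # lwzu: effective address is written back to the register
        data = memory[anchor]  # ...and the word is loaded from that address
        address += 4           # stwu: same write-back on the destination side
        memory[address] = data
    return memory

SRC = 0x80001000  # assumed location of _DataStart, not a real address from the thread
DST = 0x80122340
words = [0xFFFFFFFF, 0x10637FFF, 0x3F800000, 0x40490FD0]
mem = {SRC + 4 * i: w for i, w in enumerate(words)}

copy_with_update(mem, SRC, DST, len(words))
copied = [mem[DST + 4 * i] for i in range(len(words))]
print([f"{w:08X}" for w in copied])
```

Nothing is ever read from or written to the "minus 4" addresses themselves; they are only starting values for the pointers.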

Also, to prove a point about using these commands to insert data into your code, I switched it up a bit.  Now, it loads a variety of data types; one word, four bytes, the float 1.0 and a float approximation of pi.  The assembler will automatically convert all those values to hex.
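
If you want to double-check what the assembler should produce for those directives, the same conversion can be sketched in Python with the struct module. The helper names (long_, byte_, float_) are made up for this example; the only assumption is that the target is big-endian with IEEE-754 single-precision floats, which is the case for the Wii's PowerPC.

```python
import struct

# Hypothetical helpers that mimic the assembler directives used in the code.
def long_(value):
    """.long: one 32-bit big-endian word."""
    return struct.pack(">I", value)

def byte_(*values):
    """.byte: one byte per value, in order."""
    return bytes(values)

def float_(value):
    """.float: IEEE-754 single precision, big-endian."""
    return struct.pack(">f", value)

data = long_(0xFFFFFFFF) + byte_(16, 99, 127, 255) + float_(1.0) + float_(3.14159)
words = [data[i:i + 4].hex().upper() for i in range(0, len(data), 4)]
print(words)  # ['FFFFFFFF', '10637FFF', '3F800000', '40490FD0']
```

These are the same four words that show up as the string data in the compiled code.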

[spoiler].set base,0x80122340 - 4

.set LR_save_reg,14
.set data,15
.set address,16
.set offset,17

.set counter,18
.set anchor,18


stwu r1,-80(r1)
stmw r14,8(r1)
mflr   LR_save_reg

li   counter,(_DataEnd - _DataStart)/4
mtctr   counter

bl   _DataEnd
_DataStart:
.long 0xFFFFFFFF
.byte 16, 99, 127, 255
.float 1.0
.float 3.14159

_DataEnd:

mflr   anchor
subi anchor, anchor, 4
lis   address,base@h
ori   address,address,base@l

_Loop:
lwzu   data,4(anchor)
stwu   data,4(address)
bdnz+   _Loop

mtlr   LR_save_reg
lmw   r14,8(r1)
addi   r1,r1,80[/spoiler]

Note the compiled form

[spoiler]C0000000 0000000B
9421FFB0 BDC10008
7DC802A6 3A400004
7E4903A6 48000015
FFFFFFFF 10637FFF
3F800000 40490FD0

7E4802A6 3A52FFFC
3E008012 6210233C
85F20004 95F00004
4200FFF8 7DC803A6
B9C10008 38210050
4E800020 00000000[/spoiler]

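The string data is the pair of lines FFFFFFFF 10637FFF and 3F800000 40490FD0 (bolded in the original post).  Compare this to the equivalent 06 string write: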

06122340 00000010
FFFFFFFF 10637FFF
3F800000 40490FD0


EDIT typo

Y.S.

I assumed that r2 can be used as long as it is pushed on the stack (haven't actually tested yet). Thanks for pointing that out.

And about the optimization, I think your idea of using lwzu/stwu in a loop is a more elegant way to copy a block of data, but what I don't understand is the register assignment. Are there any guidelines to follow when you choose which register(s) to use in the code?
In other words, why did you choose r14~r18?


dcx2

We typically recommend pushing everything from r14 up onto the stack.  r14-r31 are all non-volatile registers that are safe to use when their contents are backed up on the stack.  At that point, you can use any of the registers, so I just started with r14 and worked my way up each time I needed a new register.
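
The frame sizes in the two listings follow from how many registers stmw saves. A quick Python sketch of the arithmetic, assuming an 8-byte header below the saved registers (both listings store at offset 8) and rounding up to a multiple of 8; the function name and the alignment default are my own assumptions:

```python
def frame_size(first_saved_reg, header_bytes=8, alignment=8):
    """Bytes needed for stwu/stmw when saving rN..r31 (illustrative sketch)."""
    saved = 32 - first_saved_reg      # stmw rN stores rN through r31
    raw = header_bytes + 4 * saved    # each register takes one 4-byte word
    return (raw + alignment - 1) // alignment * alignment

print(frame_size(14))  # 80, matches stwu r1,-80(r1) with stmw r14,8(r1)
print(frame_size(2))   # 128, matches stwu r1,-0x80(r1) with stmw r2,8(r1)
```

So saving from r14 up needs 80 bytes, while saving from r2 up needs 0x80.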

A C0 code's guidelines are different because it runs in the context of the code handler.

For C2 codes, I made a post recommending guidelines for choosing registers.  Most of the time, you don't need to create a stack frame for a C2 code if you pick your hook right.

http://wiird.l0nk.org/forum/index.php/topic,6555.0.html

Deathwolf

Thank you guys, that is what I was looking for, but I have never used these instructions.
Could you explain this a little more please?
lolz