Code type explanation

Started by daijoda, October 04, 2011, 03:00:15 AM


daijoda

1. How do you store the current pointer address? If offset 0x81000000 holds 80000010, and

4A000000 91000000
14______ ________
4C000000 81000000

Does offset 0x81000000 now hold 91000000?
Or does offset 0x80000010 hold 91000000?

I don't understand what this means:
46000000 00000000 = ba points to next code's first 32bits.
46000004 00000000 = ba points to next code's second 32bits.

For this code:
46000004 00000000 # line 1
04101010 00000001 # line 2
04202020 00000002 # line 3
E0000000 80008000 # line 4
04000000 00000003 # line 5

Which offsets do lines 2, 3 and 5 write to?

2. How do I create a branch? For example, I want to insert @0x80300000 the following:

addi r0,r0,1
stw r0,0(r31)

# then branch to code_A @0x81300000

code_A:
lwz r0,0(r31)
subi. r0,r0,1

# branch if equal to code_B @0x90000000
# go back to next original instruction (@0x80300004) if not equal

What would these branches look like?

3. Can you write to a gecko register's address using something like this:

Assume grN is @ 0x80001808, then

04001808 xxxxxxxx

or

lis r1,0x8000
ori r1,r1,0x1808
li r2,x
stw r2,0(r1)

Thanks for reading, sorry for asking so many questions.

Stuff

#1
Let's see if I'm good at explaining stuff.

4A000000 91000000 //po = 91000000
14______ ________ //4byte write at 91______
4C000000 81000000 //store pointer address at 81000000. so [81000000] = 91000000 if I'm not mistaken. Probably useful for changing a pointer in a table or something. I think it's trash.
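
A toy model of those three lines may make the answer to question 1 easier to see. This is only a sketch of the behaviour as described above, not the real code handler; the offset 0x000010 and the value 0xDEADBEEF are made-up stand-ins for the blanks in the 14 line.

```python
# Toy model of the 4A / 14 / 4C sequence, following the description above.
# The offset 0x000010 and value 0xDEADBEEF are hypothetical fill-ins.

def run_example(memory):
    po = 0x91000000                     # 4A000000 91000000: po = 91000000
    memory[po + 0x000010] = 0xDEADBEEF  # 14000010 DEADBEEF: 32-bit write at po + offset
    memory[0x81000000] = po             # 4C000000 81000000: [81000000] = po
    return memory

# Before the code runs, [81000000] holds 80000010 (as in the question).
result = run_example({0x81000000: 0x80000010})

print(hex(result[0x81000000]))  # 0x91000000
print(0x80000010 in result)     # False: nothing was written to 80000010
```

So in this model it is offset 0x81000000 that ends up holding 91000000; 0x80000010 is never touched.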

46 code type: use 4E instead. I never understood the reasoning behind it, but it feels better to change only the pointer and not the base address. 46 is for the base address, 4E is for the pointer. dcx2 has some good examples of 4E being a beast. The pointer is the next code's address + XXXX. XXXX is signed, so anything above 7FFF is negative, with FFFF being -1. It's good for self-modifying codes and the like. I think I used it once just for fun. Your code:

46000004 00000000 //ba = the next line's address
04101010 00000001 //[this line's address(the 2nd 32 bits) +101010] = 00000001
04000000 00000002 //(changed so you can see it better) that 00000001 on that last line becomes 00000002.
E0000000 80008000 //terminator
04000000 00000003 //[80000000] = 00000003
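
The "next code's address + XXXX, with XXXX signed" rule can be sketched like this. The function names and the address 0x80002900 are made up for illustration, and the sketch assumes XXXX is the low 16 bits of the first word, as in 46000004.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def address_from_code(code_addr, first_word):
    """Address a 46/4E line produces, given where that line sits in memory.

    code_addr is hypothetical here; the real address depends on the code
    handler and on how many codes come before this one.
    """
    next_code = code_addr + 8  # every code line is 8 bytes long
    return (next_code + sign_extend16(first_word)) & 0xFFFFFFFF

here = 0x80002900  # made-up address of the 46/4E line
print(hex(address_from_code(here, 0x46000000)))  # 0x80002908, next code's first 32 bits
print(hex(address_from_code(here, 0x46000004)))  # 0x8000290c, next code's second 32 bits
print(hex(address_from_code(here, 0x4E00FFFF)))  # 0x80002907, FFFF counts as -1
```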

4E does come in handy. It's probably a good way to shorten your roller. Sorry I can't provide any good examples myself, but I found a nice one from dcx2: http://wiird.l0nk.org/forum/index.php/topic,8671.msg72146.html#msg72146

Well, C2 creates a branch to the code in the code handler, and the code branches back at the end. You have to end it with 00000000 because that's where the branch back is going to be written.

C2300000 00000004 //4 is number of lines to run
addi r0,r0,1 //use pyiiasmh or asm<>wiird to turn them into gecko codes
stw r0,0(r31)
lwz r0,0(r31)//might as well follow it up
subi. r0,r0,1
bne- _END
_CODE_B:
_END:   //include original instruction if needed
00000000 // or 60000000 00000000 if you have to. This will be where it branches back.

You can write to gecko registers if you want. They're somewhere low in memory. You can use ifs on them, or anything else you can think of. But there are code types for gecko registers.

Quote from: dcx2 on July 19, 2011, 04:57:04 AM
Yes, you can re-use gr2 to "copy and paste" the pointer after it has been used to R*Q+S.  Your code looks pretty good.

Close.  gr0 = 80001808.  gr1 = 8000180C.  Each gr is 4 bytes.  grF = 80001844.

b0 = 80001848.  b1 = 80001850.  Each block is 8 bytes.  The first 4 bytes are the pointer, the second 4 bytes are...uh...something about repeat.

The code handler owns everything between 80001800 and 80003000 so all of that memory is safe.  The gr and blocks are at the beginning; then the code handler's ASM; and at the end is the code list.
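
The addresses in that quote follow a simple pattern, sketched below. The function names are made up, and the direct-write helper assumes the default base address of 0x80000000 so that 04001808 lands on gr0.

```python
GR_BASE = 0x80001808     # gr0, per the quote above
BLOCK_BASE = 0x80001848  # b0, per the quote above

def gr_address(n):
    """Address of gecko register grN; each register is 4 bytes."""
    if not 0 <= n <= 0xF:
        raise ValueError("gecko registers run from gr0 to grF")
    return GR_BASE + 4 * n

def block_address(n):
    """Address of block bN; each block is 8 bytes."""
    if n < 0:
        raise ValueError("block index must not be negative")
    return BLOCK_BASE + 8 * n

def direct_write_code(n, value):
    """04-type code line that writes value straight into grN (ba assumed 80000000)."""
    return f"{0x04000000 | (gr_address(n) & 0x01FFFFFF):08X} {value & 0xFFFFFFFF:08X}"

print(hex(gr_address(0xF)))        # 0x80001844
print(hex(block_address(1)))       # 0x80001850
print(direct_write_code(0, 0x2A))  # 04001808 0000002A
```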
.make Stuff happen.
Dropbox. If you don't have one, get it NOW! +250MB free if you follow my link :p.

Mod code Generator ~50% complete but very usable:
http://dl.dropbox.com/u/24514984/modcodes/modcodes.htm

dcx2

#2
Regarding 4A/4C code types, Stuff is correct about their functionality.  Although I don't think the code type is trash ;)

The 46 code type is useless.  Only the upper 7 bits of the ba are ever used; the lower 25 bits are ignored.
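
Taking that statement at face value, the effect can be sketched as a bit mask. The function name and the sample address are made up; the real handler may apply the mask at a different point.

```python
BA_MASK = 0xFE000000  # upper 7 of 32 bits

def effective_ba(ba):
    """Part of the base address that survives if only the upper 7 bits are used."""
    return ba & BA_MASK

# A 46 code would put a code-list address such as 0x800028CC into ba,
# but the low 25 bits are thrown away, so it behaves like plain 0x80000000.
print(hex(effective_ba(0x800028CC)))  # 0x80000000
```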

All bits of the po are used, so the 4E code type is actually useful.  However, based on your example question, it seems that you have a misconception.  The codes that you send actually exist in the game's memory.  The specific address that they are at will change depending on how many codes you have and what version of the code handler you use.  The 4E is used to get the address of a code.  The example Stuff linked to is pretty good.

Regarding your example ASM, it will not do what you think it will do.  Some registers have very special meanings.  r0 is one of them.  If you look at the datasheet for the addi instruction http://pds.twi.tudelft.nl/vakken/in101/labcourse/instruction-set/addi.html you will see the notation (rA|0).  In this case, when r0 is used for rA, the actual value 0 and not the contents of r0 are used.

You might be asking WTF use is that?  Well, it's actually really useful.  Suppose you wanted to load the value 7 into register r3.  We would normally write li r3,7.  However, if you look at the op codes you will see that li r3,7 is actually a mnemonic for addi r3,r0,7.  Because 0 is used instead of r0, this essentially means "r3 = 0 + 7".
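
Here is a small model of the (rA|0) rule, plus the encoding that shows li is just addi with rA = 0. The register list and function names are only for illustration.

```python
def addi(regs, rd, ra, simm):
    """Model of addi rD,rA,SIMM: when rA is 0, the literal 0 is used, not r0."""
    base = 0 if ra == 0 else regs[ra]
    regs[rd] = (base + simm) & 0xFFFFFFFF

def encode_addi(rd, ra, simm):
    """32-bit encoding of addi (primary opcode 14)."""
    return (14 << 26) | (rd << 21) | (ra << 16) | (simm & 0xFFFF)

regs = [0] * 32
regs[0] = 5
addi(regs, 0, 0, 1)  # "addi r0,r0,1"
print(regs[0])       # 1, not 6: the old contents of r0 were ignored

print(f"{encode_addi(3, 0, 7):08X}")  # 38600007, the same word as li r3,7
```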

C2 codes are used to insert more than one ASM instruction, as Stuff has shown.  C2 codes do three things.  1) They provide an address for the ASM to exist at by virtue of existing in the code handler; typically the codes will live at address 80002XXX.  2) C2 codes automatically write a branch to the code at the hooked address.  3) C2 codes automatically write a branch from the end of the C2 code back to the game.
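
As a rough sketch, this is how a C2 code is usually laid out once the ASM has been assembled. The function name is made up, and the padding rule (a nop when needed so that the final 00000000 ends a full line) is the common convention seen in this thread's examples, not something taken from the code handler source.

```python
def build_c2(hook_addr, instructions):
    """Lay out assembled instruction words as a C2 code.

    instructions is a list of 32-bit words. The final 00000000 is the slot
    where the branch back to the game gets written.
    """
    words = list(instructions)
    if len(words) % 2 == 0:
        words.append(0x60000000)  # nop, so the 00000000 lands in the second half of a line
    words.append(0x00000000)
    header = f"{0xC2000000 | (hook_addr & 0x01FFFFFF):08X} {len(words) // 2:08X}"
    body = [f"{words[i]:08X} {words[i + 1]:08X}" for i in range(0, len(words), 2)]
    return [header] + body

# addi r0,r0,1 = 38000001 and stw r0,0(r31) = 901F0000, hooked at 0x80300000
for line in build_c2(0x80300000, [0x38000001, 0x901F0000]):
    print(line)
# C2300000 00000002
# 38000001 901F0000
# 60000000 00000000
```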

The Gecko Registers do start at 80001808.  Yes, you can write to them directly.  They can also be referenced in a C0 code, typically with r7 although this can vary depending on the code handler version.

However, I would like to mention once again that some registers have special meanings.  r1 is the stack pointer.  You can only use the stack pointer if you follow a very specific convention, otherwise you will crash the game.  r2 is the small read-only data anchor.  Never ever ever write to r2, it will crash for sure.

In general, if you have to use a register and you don't know what registers are safe, you can use r12 safely 99.99% of the time.  The ultimate in safety is making your own stack frame, but you can usually get away with r12 and working your way down.

daijoda

This is neat. So, the "next code address" basically means that code's relative address in the .gct with respect to other codes that also exist in the .gct... haha, I thought it meant some other address. Thanks for the clarification.

Within a C2 code, the location of the destination branch is counted by its distance from the current line, correct? If the start of a branch is the 4th line below the current line, then a branch to it could look like "beq- 0x10"? But what if I want to branch to an existing line outside of the C2 code, how do I count the lines for the address?

Say, I have C2888888 (please ignore the instructions, they're not supposed to make sense, they're just there to take up space):

addi r11,r11,1
subi r12,r12,1
cmpwi r12,0
beq- ????
li r12,0
li r11,100

What value should I write for ???? if the beginning of that branch is at 0x80888A00?

And while I'm at it, I'm also unsure about creating brand new branches at new locations:

C6004400 90100000

Does this make 0x80004400 say "b 0x90100000" when you go to the disassembly tab? And if so, can I assume that I can just string write some asm instructions starting at 0x90100000 and the game can read them as if they're native to the game?

dcx2

#4
When you tell a loader to load SD cheats, it copies the gct as a binary blob to the end of the code handler.  Therefore, to say "the code's relative address in the gct" is wrong, because the code handler never sees the gct itself, and it's not a relative address, but an absolute address.

If you're using a 1931-style code handler, the codes will be loaded at 800028B8-80003000.  The first two words will always be 00D0C0DE 00D0C0DE (do code).  So the actual list of codes will start at 800028C0.  If the very first code line was 4E000000 00000000, then it would put 800028C8 into the po; this code line was at 800028C0, so the next code line is +8.  If the second code line was also the 4E code, then it would put 800028D0 into the po.  So depending on what line a specific code is at, the 4E code will put a different pointer into the po.  If you use a different code handler, the addresses will be different, but the 4E code will always make sure that the po is pointing to the right code, regardless of what order the codes are in or what code handler you're using.  (it would also change depending on whether you loaded the debugger, or just the code handler)
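
The arithmetic in that paragraph can be written out as follows. The constants come from the paragraph itself and only apply to a 1931-style code handler; the function names are made up, and the index counts 8-byte code lines, not whole codes.

```python
CODE_LIST_START = 0x800028B8  # where a 1931-style handler loads the codes
HEADER_SIZE = 8               # 00D0C0DE 00D0C0DE

def code_line_address(index):
    """Address of the index-th 8-byte code line (0 = first line after the header)."""
    return CODE_LIST_START + HEADER_SIZE + 8 * index

def po_after_4e(index):
    """po after a 4E000000 00000000 sitting at the given line index."""
    return code_line_address(index) + 8  # address of the next code line

print(hex(code_line_address(0)))  # 0x800028c0
print(hex(po_after_4e(0)))        # 0x800028c8
print(hex(po_after_4e(1)))        # 0x800028d0
```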

---

This also ties in to the "string write some ASM".  That is how it was done in the "old days".  But it was tedious, because you had to find some safe memory, and hook it up with a couple branches.  There is also a major flaw; string writing ASM will mean there are *two* copies of your ASM in memory.  One copy is in the code handler (between 80002XXX and 80003000), and one copy is where you did your string write!

That's why it's just better to use the C2 code.  The ASM doesn't have to be string write'd anywhere; it already lives in the code handler.  The C2 code will also make sure there are branches from the game to the beginning of the code in code-handler-memory, and from the end of the code back to the game.

It's important to recognize that C2 codes do not insert ASM at the hook address.  They over-write the hook with a branch to your code.  This is why C2 codes must always have the "original instruction" somewhere, because that instruction is replaced by the branch-to-code.

---

It is tedious to count bytes in order to determine branch destinations.  It is much easier to let the assembler do that for you.  Computers are much better at counting than humans are.

addi r11,r11,1
subi r12,r12,1
cmpwi r12,0
beq- FOO
li r12,0
FOO:
li r11,100

This will branch over the li r12,0 if the r12 == 0.  The beq- will "jump" execution to the branch label.
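
To show the counting an assembler does for you, here is a tiny label resolver. It is only a sketch with made-up function names; a real assembler also handles directives, comments and so on.

```python
def resolve_labels(source_lines):
    """Return ({label: byte offset}, [instructions]) for a simple ASM listing."""
    labels, instructions = {}, []
    for line in source_lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = 4 * len(instructions)  # offset of the next instruction
        else:
            instructions.append(line)
    return labels, instructions

def branch_displacement(source_lines, label):
    """Byte distance from the instruction that references label to the label."""
    labels, instructions = resolve_labels(source_lines)
    for index, text in enumerate(instructions):
        if text.split()[-1] == label:
            return labels[label] - 4 * index
    raise ValueError("no instruction references " + label)

listing = [
    "addi r11,r11,1",
    "subi r12,r12,1",
    "cmpwi r12,0",
    "beq- FOO",
    "li r12,0",
    "FOO:",
    "li r11,100",
]
print(hex(branch_displacement(listing, "FOO")))  # 0x8, two instructions ahead
```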

---

You cannot branch to absolute addresses.  All branches on the Wii will be relative branches.  This means you need to know how far away your destination is from the current address.

However, because C2 codes may exist at any address in the code handler, you cannot determine the relative offset needed.  This means you cannot branch "out of" C2 codes.

There is an option, though; you can load an absolute address into ctr and then bctr.

lis r12,0x8088
ori r12,r12,0x8A00
mtctr r12
bctr
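
Splitting an absolute address into the two immediates is easy to get wrong by hand, so here is a small sketch (function names made up). Because the low half is merged with ori rather than added with addi, the upper half needs no adjustment. The signed view is included because some disassemblers print lis immediates as negative decimals.

```python
def split_lis_ori(addr):
    """Return the (upper, lower) 16-bit halves for a lis/ori pair."""
    return (addr >> 16) & 0xFFFF, addr & 0xFFFF

def as_signed16(value):
    """How a disassembler may print a 16-bit immediate as a signed decimal."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

hi, lo = split_lis_ori(0x80888A00)
print(f"lis r12,0x{hi:04X}")      # lis r12,0x8088
print(f"ori r12,r12,0x{lo:04X}")  # ori r12,r12,0x8A00
print(as_signed16(0x8012))        # -32750
```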

---

You are on the right track with the C6 code.  However, I have never seen the Wii execute ASM from MEM2.  I'm not sure that it's strictly impossible...but there are major complications with trying to do so.  The biggest problem is that branches are always relative, and an unconditional branch only has 26 bits available for adding/subtracting to the current address (technically only 24 since you can't use the lower 2 bits).  Branching from MEM1 to MEM2 would require 29 bits (technically 27).  You might be able to bctr to MEM2, but you would also need to bctr back to MEM1.

Now if you were branching inside MEM1 (e.g. C6004400 80100000), then yes it would work much like you suggest (80004400: b 0x80100000).  And yes, you could string write your ASM and then C6 to it (don't forget to C6 back!  or hard code a branch back).  However, you would still have two copies of your ASM (the copy in the string write code itself, and the copy that the string write makes).  A C2 code simplifies that process of string writing ASM and branching.
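
The range limit is easy to check with a sketch of how a relative b/bl is encoded. The function name is made up, and this only covers the plain unconditional branch.

```python
def encode_branch(src, dst, link=False):
    """Encode a relative b (or bl) placed at src that jumps to dst."""
    offset = dst - src
    if offset % 4:
        raise ValueError("branch targets must be 4-byte aligned")
    if not -0x02000000 <= offset <= 0x01FFFFFC:
        raise ValueError("target is out of range for a relative branch")
    return 0x48000000 | (offset & 0x03FFFFFC) | (1 if link else 0)

# Inside MEM1 this works: 80004400: b 0x80100000
print(f"{encode_branch(0x80004400, 0x80100000):08X}")  # 480FBC00

# MEM1 to MEM2 does not fit in the offset field.
try:
    encode_branch(0x80004400, 0x90100000)
except ValueError as error:
    print(error)  # target is out of range for a relative branch
```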

Stuff

"You might be asking WTF use is that?" XD. I was asking that. Also, I didn't notice that subi r0, r0. I just copied it as an example. A bad one, I guess.

"Next code address" means relative to where the code sits in memory.

Branching somewhere out of a C2 can be weird, since it's relative to where the branch is and the C2 can be anywhere.
dcx2 beat me to this. I was in school >.<. Here's a chopped up example of how I got around that.

MID auto target as C2 and not taking up random space
C211F010 0000000C
3E009015
81F0807C 3A000000
55EA273E 280A0009
4082002C 7DEF1850
2C0F0000 41800014
4182000C 3A000B18
7E0F8396 3A100001
3DE0817C 9A0F4FFC
3DE08012 61EF6690
7DE803A6 81F407A4
4E800021
60000000 00000000

Had to take some things out, since idk how the original author(s) would feel if I released a version that works on/offline out of nowhere. So unless they say it's ok, it's not getting released. This chopped-up version is functionally identical to the auto target Doudley posted. It just doesn't sit in some location taking up 2x the code's size.

lis r16,-28651
lwz r15,-32644(r16)
li r16,0
rlwinm r10,r15,4,28,31
cmplwi r10,9
bne- 0x002C
sub r15,r3,r15
cmpwi r15,0
blt- 0x002C
beq- 0x002C
li r16,2840
divwu r16,r15,r16
addi r16,r16,1
lis r15,-32388
stb r16,20476(r15)
lis r15,-32750        ##
ori r15,r15,26256   ##
mtlr r15               ##make LR 0x80126690
lwz r15,1956(r20)
blrl                     ##branch to 0x80126690(dcx2 says bctrl is the right way though)
00000000  #branch back

Originally, auto target ended like this:
817C6040:  4A960651   bl   0x80126690  ##not sure why it branches to this, but I assume it ends with a blr that comes back to this function. Let's not break it and just include it.
817C6044:  4A958FD0   b   0x8011f014  ##branch back
(0411F010 496A6FF0)
8011F010:  496A6FF0   b   0x817c6000  ##branch to beginning of the code >.>C2?
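
Those three branch words can be checked by decoding them. This is a sketch with a made-up function name; it only understands the relative b/bl form and rejects everything else.

```python
def decode_branch(addr, word):
    """Return (mnemonic, target) for a relative b/bl word located at addr."""
    if word >> 26 != 18:
        raise ValueError("not an unconditional branch")
    if word & 2:
        raise ValueError("absolute form is not handled in this sketch")
    offset = word & 0x03FFFFFC
    if offset & 0x02000000:   # sign bit of the 26-bit offset
        offset -= 0x04000000
    mnemonic = "bl" if word & 1 else "b"
    return mnemonic, (addr + offset) & 0xFFFFFFFF

for addr, word in [(0x817C6040, 0x4A960651),
                   (0x817C6044, 0x4A958FD0),
                   (0x8011F010, 0x496A6FF0)]:
    mnemonic, target = decode_branch(addr, word)
    print(f"{addr:08X}: {mnemonic} 0x{target:08X}")
# 817C6040: bl 0x80126690
# 817C6044: b 0x8011F014
# 8011F010: b 0x817C6000
```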

So yeah. This is why I was asking whether blrl existed.

daijoda

Oh! I had no idea you could just write "FOO" verbatim in the ASM Wiird converter! So, as long as you use all caps, you could choose any word for the label for a branch?

To branch to an absolute address bctr style, can I still use an if...condition for the branch?

If I use "bctrl" for the branch, and use "blr" at the end of that branch, does that make it come back to the line right below the "bctrl" instruction?

Would a branch that, at its end, branches back to its own beginning essentially block the game's original "flow" of instructions between the branch-1 line and the branch+1 line?

How would "00000000" be treated if it's inserted in the middle of an instruction "flow"? Does the game view this as "nop", or would it crash the game?

dcx2

I don't think the caps are required.

I just use all caps (and typically a leading underscore) for convention.  You can also .set variables; I also use all caps for variables but no leading underscore (again, just my personal convention).  This way, when I'm looking at some ASM I wrote, I will know that _FOO is a label, but BAR is a variable.

One thing is required though.  The label itself must end with a :.  And there can be only one of these labels.  Unless you use single-digit-branch labels; those can be re-used, but are referenced with e.g. 0f or 0b.

Here's a post with some example ASM.  It uses full branch labels, single digit branch labels, and variables.  http://wiird.l0nk.org/forum/index.php/topic,8768.msg74666.html#msg74666  (EDIT: look in the spoiler)

---

You can still cmp/beq before a bctr.  You may even be able to cmp/beqctr.  I think that's what you mean by "if...condition"

Yes, if you bctrl and there's a blr at the end, it will come back.  This of course assumes that the PowerPC EABI calling convention is followed by whatever you bctrl'd to.

Yes, you can get stuck in an infinite loop and it will block the execution of the game.  This is in fact how breakpoints and pause work; during a bp or pause, the code handler is endlessly looping, preventing the game from processing anything.  Once the game is put back into the run condition, the code handler allows execution to return to the game.

Executing 00000000 will crash the game with a Program Exception.  If you've attached with the latest version of Gecko.NET, then it will automatically install the Program Exception handler so you can catch them.  If you use an old Gecko.NET or WiiRDGUI, then you only get data, instruction, and trace exceptions.