So I'm trying to write a code which interacts with the USB Gecko to send/receive custom data to a PC app. So far, I can make the code send data over the Gecko with no issues. However, trying to have the code receive data from the Gecko into a buffer in the Wii's RAM immediately causes the game to crash. It is further complicated by the fact that since my code takes over the USB Gecko, trying to debug the ASM line-by-line via Gecko.NET is not possible.
The code never crashes without the branch to the code handler's exireceivebuffer routine (although obviously it doesn't receive any data). When I do branch to that routine, something crashes, although I can't tell for sure where in the routine it crashes (since I can't debug while my code is trying to use the Gecko).
So, here's the entire ASM:
[spoiler]
# save stack frame
stwu r1,-0x80(r1)
stmw r2,8(r1)
# save LR to r8 and CTR to r6
mflr r8
mfctr r6
la r9, 0(r20) # set r9 to PO... this will need a tweak whenever the code handler is modified... r20 in GameCube code handler... r16 in Wii code handler
la r11, 4(r9) # set r11 to PO+4
receivesegment: # start of loop
lwz r12, 0(r9)
lwz r3, 0(r11)
cmpwi cr0, r12, 0 # compare r12 to 0... was cmpdi r9, 0, but apparently that doesn't work on 32-bit PowerPC
beq endsegment # break the loop if we hit the null
# call the code handler routine... this will need a tweak whenever the code handler is modified... 0x80001E00 for GameCube
lis r7,0x8000
ori r7,r7,0x1E00
mtctr r7
# run the Gecko Receive routine within its own stack frame
stwu r1,-0x80(r1)
stmw r2,8(r1)
bctrl # this is the actual branch THIS IS WHERE IT CRASHES
lmw r2,8(r1)
addi r1,r1,0x80
la r9, 8(r9) # r9 += 8
la r11, 8(r11) # r11 += 8
b receivesegment # end of loop
endsegment:
# restore LR from r8 and CTR from r6
mtlr r8
mtctr r6
# restore stack frame
lmw r2,8(r1)
addi r1,r1,0x80
[/spoiler]
This ASM is compiled into a C0 code, and executed immediately after the following Ocarina code:
4E000008 00000000 // Load P2 address list pointer into PO
66000002 00000000 // Jump over P2 address list
80CD3614 0000000C // P2 XYZ
00000000 00000000 // Null terminator, end of address list
The entire code is supposed to read data from the Gecko and write 0x0C bytes of it to 0x80CD3614, and then stop when it hits the line with the 0's. (But if I were to have more lines of addresses/lengths in that block, it would cycle through them until it reached the 0 line.) This particular address is the XYZ coords of player 2 in Sonic Adventure 2: Battle for GameCube. (I'm running this with the GameCube code handler.)
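To make the intended loop concrete, here is a minimal C sketch of the same walk over the address list. The names (`gecko_range`, `count_ranges`, `total_bytes`) are made up for illustration and are not part of the code handler; the only thing taken from the post is the layout, i.e. pairs of 32-bit words (address, then length) ending at a pair whose address word is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 8-byte entry of the list that PO points at: address word, then
   length word. Hypothetical name; the layout is the one described above. */
typedef struct {
    uint32_t address;
    uint32_t length;
} gecko_range;

/* Count entries up to (not including) the null terminator. */
size_t count_ranges(const gecko_range *list)
{
    assert(list != NULL);
    size_t n = 0;
    while (list[n].address != 0) {  /* same test as "cmpwi r12, 0 / beq endsegment" */
        n++;                        /* same step as "r9 += 8, r11 += 8" */
    }
    return n;
}

/* Total number of bytes one pass over the list expects to transfer. */
uint32_t total_bytes(const gecko_range *list)
{
    assert(list != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; list[i].address != 0; i++) {
        sum += list[i].length;
    }
    return sum;
}
```

For the single-entry list above this gives one range and 0x0C bytes, so the PC app would presumably have to supply exactly that many bytes per pass.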
Changing the branch address from 0x80001E00 to 0x80001E34 (which is the exisendbuffer routine) does work (it sends 0x0C bytes over the USB Gecko from 0x80CD3614).
So, that's my situation. Any suggestions on why receiving data doesn't work properly but sending does? I'm fairly new to ASM. (And I apologize for the long post here; hopefully the technical details I'm providing are useful.)
Thanks!
EDIT: In case it's any help, here are the relevant disassembled routines from the GameCube code handler I'm using, since it's somewhat different from the Wii code handler that is open-source:
[spoiler]
800036BC: 7FC802A6 mflr r30 #exireceivebyte
800036C0: 3C60A000 lis r3,-24576
800036C4: 48000019 bl 0x800036dc
800036C8: 74C30800 andis. r3,r6,2048
800036CC: 54C5843E rlwinm r5,r6,16,16,31
800036D0: 98AD0000 stb r5,0(r13)
800036D4: 7FC803A6 mtlr r30
800036D8: 4E800020 blr
800036DC: 92F86814 stw r23,26644(r24) #checkexisend
800036E0: 90786824 stw r3,26660(r24)
800036E4: 92D86820 stw r22,26656(r24)
800036E8: 80B86820 lwz r5,26656(r24) #exicheckreceivewait
800036EC: 70A50001 andi. r5,r5,1
800036F0: 4082FFF8 bne+ 0x800036e8
800036F4: 80D86824 lwz r6,26660(r24)
800036F8: 90B86814 stw r5,26644(r24)
800036FC: 4E800020 blr
80003700: 7D4802A6 mflr r10 #exireceivebuffer... r3 counter, r12 buffer... 0x80001E00
80003704: 7C6903A6 mtctr r3
80003708: 39C00000 li r14,0
8000370C: 4800007D bl 0x80003788 #bufferloop
80003710: 48000079 bl 0x80003788
80003714: 4BFFFFA9 bl 0x800036bc
80003718: 4182FFF4 beq+ 0x8000370c
8000371C: 888D0000 lbz r4,0(r13)
80003720: 7C8E61AE stbx r4,r14,r12
80003724: 39CE0001 addi r14,r14,1
80003728: 4200FFE4 bdnz+ 0x8000370c
8000372C: 7D4803A6 mtlr r10
80003730: 4E800020 blr
80003734: 7D4802A6 mflr r10 #exisendbuffer... r3 counter, r12 buffer... 0x80001E34
80003738: 7C6903A6 mtctr r3
8000373C: 39C00000 li r14,0
80003740: 7C6C70AE lbzx r3,r12,r14 #sendloop
80003744: 4800001D bl 0x80003760
80003748: 4182FFF8 beq+ 0x80003740
8000374C: 39CE0001 addi r14,r14,1
80003750: 4200FFF0 bdnz+ 0x80003740
80003754: 7D4803A6 mtlr r10
80003758: 4E800020 blr
8000375C: 386000AA li r3,170 #exisendbyteAA
80003760: 7FC802A6 mflr r30 #exisendbyte
80003764: 5463A016 rlwinm r3,r3,20,0,11
80003768: 6463B000 oris r3,r3,45056
8000376C: 3AC00019 li r22,25
80003770: 3AE000D0 li r23,208
80003774: 3F00CC00 lis r24,-13312
80003778: 4BFFFF65 bl 0x800036dc
8000377C: 54C337FF rlwinm. r3,r6,6,31,31
80003780: 7FC803A6 mtlr r30
80003784: 4E800020 blr
80003788: 7FC802A6 mflr r30 #exicheckreceive
8000378C: 3C60D000 lis r3,-12288 #exicheckreceive2
80003790: 4BFFFF4D bl 0x800036dc
80003794: 54C337FF rlwinm. r3,r6,6,31,31
80003798: 4182FFF4 beq+ 0x8000378c
8000379C: 7FC803A6 mtlr r30
800037A0: 4E800020 blr
[/spoiler]
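The rlwinm/oris/andis. lines in the listing above encode a small command protocol. A rough C reading of them might look like the sketch below. The constants come straight from the listing (0xB... for send, 0xA... for receive, 0xD... for check-receive, and the two bits that get tested); the function and macro names are hypothetical and only there for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define GECKO_CMD_RECV      0xA0000000u  /* lis r3,-24576 in exireceivebyte  */
#define GECKO_CMD_SEND      0xB0000000u  /* oris r3,r3,45056 in exisendbyte  */
#define GECKO_CMD_CHECK_RX  0xD0000000u  /* lis r3,-12288 in exicheckreceive */

#define GECKO_REPLY_STATUS  0x04000000u  /* bit picked by "rlwinm. r3,r6,6,31,31" */
#define GECKO_REPLY_RXVALID 0x08000000u  /* bit picked by "andis. r3,r6,2048"     */

/* exisendbyte: "rlwinm r3,r3,20,0,11" moves the byte up by 20 bits,
   then the send command nibble is OR'ed in. */
uint32_t gecko_send_word(uint8_t byte)
{
    return GECKO_CMD_SEND | ((uint32_t)byte << 20);
}

/* Status bit checked after a send and after a check-receive command. */
int gecko_status_set(uint32_t reply)
{
    return (reply & GECKO_REPLY_STATUS) != 0;
}

/* exireceivebyte: was a byte actually returned with this reply? */
int gecko_recv_valid(uint32_t reply)
{
    return (reply & GECKO_REPLY_RXVALID) != 0;
}

/* exireceivebyte: "rlwinm r5,r6,16,16,31" followed by stb keeps the
   low byte of (reply >> 16). Only meaningful when the valid bit is set. */
uint8_t gecko_recv_byte(uint32_t reply)
{
    assert(gecko_recv_valid(reply));
    return (uint8_t)(reply >> 16);
}
```

For example, `gecko_send_word(0xAA)` gives 0xBAA00000, which is the word exisendbyteAA would put on the bus.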
I really don't like bumping posts, but does anyone have suggestions on this? dcx2? brkirch? Anyone?
bump. I want to know how this turns out D:
biolizard89,
You could be overcomplicating things, as upload and download from memory commands are already available; couldn't you use those? Just make your PC app interact with those and let the handler always do its thing.
If I remember correctly, the 8 bytes are just the start and end addresses; you would use 0x80CD3614 and 0x80CD3614+0x0C.
uploadcode:
bl exisendbyteAA
li r3, 8 # 8 bytes
ori r12, r31, dwordbuffer@l # buffer
----
readmem:
bl exisendbyteAA
li r3, 8 # 8 bytes
ori r12, r31, dwordbuffer@l # buffer
Hi Nuke,
Thank you very much for the reply.
I am not using the standard upload/download commands which WiiRd uses because they require a round trip between the PC and the Wii before the data is transferred. This code is intended to dump and upload to many memory ranges every frame, and waiting for a ping for each memory range would severely slow down the game. My method, at least for dumping the memory, is able to dump a memory range every frame with no noticeable slowdown. I imagine that having an upload as well as a dump will slow down the game slightly, but it will be one round trip per frame instead of the 10 or so round trips that using the standard commands would require.
In addition, my method allows a Gecko code to control what memory ranges get written to. So it would allow access to data that uses pointers, which the standard commands do not permit (unless I made more round-trip pings to interpret the pointers).
Any suggestions? The comments in the code handler imply that all I need to do is set r3 and r12 with the byte counter and the start address, and branch-link to the exireceivebuffer routine. Am I reading the code wrong? Are there some other registers that I need to be careful with?
Thanks!
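One thing that may be worth checking, going only by the disassembly earlier in the thread: checkexisend stores through r24 and writes r23 and r22 to the EXI registers, but the only place in the listing that loads them (r22 = 0x19, r23 = 0xD0, r24 = 0xCC000000) is exisendbyte. The receive path (exireceivebuffer, exicheckreceive, exireceivebyte) never sets them, and exireceivebyte also stores the received byte through r13, which the listing never sets either. That would fit sending working and receiving crashing when entered from a C0 code, but it is a reading of the listing, not something confirmed in this thread. The sketch below models the transaction in C with the register block passed in explicitly, which is the role r24 plays; `exi_bus`, `exi_transfer` and the mock device are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstraction of the three registers the listing touches
   at offsets 0x6814 (status), 0x6820 (control), 0x6824 (data) off r24. */
typedef struct exi_bus {
    void     (*write)(struct exi_bus *bus, uint32_t offset, uint32_t value);
    uint32_t (*read)(struct exi_bus *bus, uint32_t offset);
    void      *ctx;
} exi_bus;

enum { EXI_SR = 0x6814, EXI_CR = 0x6820, EXI_DATA = 0x6824 };

/* Same order of accesses as checkexisend / exicheckreceivewait. */
uint32_t exi_transfer(exi_bus *bus, uint32_t command)
{
    assert(bus && bus->write && bus->read);
    bus->write(bus, EXI_SR, 0xD0);         /* stw r23,26644(r24) */
    bus->write(bus, EXI_DATA, command);    /* stw r3,26660(r24)  */
    bus->write(bus, EXI_CR, 0x19);         /* stw r22,26656(r24) */
    while (bus->read(bus, EXI_CR) & 1u) {  /* wait for the transfer bit to clear */
    }
    uint32_t reply = bus->read(bus, EXI_DATA);
    bus->write(bus, EXI_SR, 0);            /* stw r5,26644(r24), r5 is 0 here */
    return reply;
}

/* Mock device for trying the sequence off-target: it completes every
   transfer immediately and answers with a canned reply word. */
typedef struct {
    uint32_t sr, cr, data;
    uint32_t canned_reply;
    uint32_t last_command;
} mock_exi;

void mock_write(exi_bus *bus, uint32_t offset, uint32_t value)
{
    mock_exi *m = (mock_exi *)bus->ctx;
    if (offset == EXI_SR) {
        m->sr = value;
    } else if (offset == EXI_DATA) {
        m->data = value;
        m->last_command = value;
    } else if (offset == EXI_CR) {
        m->cr = value & ~1u;        /* transfer finishes at once */
        m->data = m->canned_reply;  /* reply replaces the command word */
    }
}

uint32_t mock_read(exi_bus *bus, uint32_t offset)
{
    mock_exi *m = (mock_exi *)bus->ctx;
    if (offset == EXI_SR) return m->sr;
    if (offset == EXI_CR) return m->cr;
    return m->data;
}
```

In the ASM, the equivalent precaution would be loading r22, r23 and r24 with the values exisendbyte uses (and pointing r13 at a scratch byte) before branching to exireceivebuffer.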
I now understand.
What I would suggest is to write a custom handler using the Wiird handler as a base. You only need to change the EXI registers from 0xCD to 0xCC values in the asm for it to be compatible with gamecube. And maybe you could then upload the handler and overwrite the existing one.
I can't say where or why your asm is crashing, but with asm things can easily go wrong as it's not so user friendly. The only reason the Gecko handler is such tightly optimized asm is that we knew from the very beginning we had a very tiny space to work with.
Sorry I can't help more than that.
Quote from: Nuke on October 12, 2010, 04:49:27 PM
I now understand.
What I would suggest is to write a custom handler using the Wiird handler as a base. You only need to change the EXI registers from 0xCD to 0xCC values in the asm for it to be compatible with gamecube. And maybe you could then upload the handler and overwrite the existing one.
I can't say where or why your asm is crashing, but with asm things can easily go wrong as it's not so user friendly. The only reason the Gecko handler is such tightly optimized asm is that we knew from the very beginning we had a very tiny space to work with.
Sorry I can't help more than that.
Thanks Nuke! I was unaware that porting the newer code handler to GameCube was a simple matter of changing the EXI registers. I'll play around with it.
That said, I don't want to remove the codetype functionality of the code handler, as I would like Gecko codes to have control over which memory ranges get dumped/loaded, and for miscellaneous Gecko codes to be usable simultaneously. I recall that the new code handler supports moving the code list to a custom empty memory range... I take it that would give me extra memory to implement my functionality? Maybe stick the C USB Gecko library (about 2KiB) in that memory range so that I don't have to figure out the weirdness of why the ASM USB Gecko lib that the code handler uses is crashing for me? Does that sound like a decent plan to try?
Thanks!
Quote from: biolizard89 on October 16, 2010, 07:55:57 PM
Thanks Nuke! I was unaware that porting the newer code handler to GameCube was a simple matter of changing the EXI registers. I'll play around with it.
That said, I don't want to remove the codetype functionality of the code handler, as I would like Gecko codes to have control over which memory ranges get dumped/loaded, and for miscellaneous Gecko codes to be usable simultaneously. I recall that the new code handler supports moving the code list to a custom empty memory range... I take it that would give me extra memory to implement my functionality? Maybe stick the C USB Gecko library (about 2KiB) in that memory range so that I don't have to figure out the weirdness of why the ASM USB Gecko lib that the code handler uses is crashing for me? Does that sound like a decent plan to try?
Thanks!
Okay, I got bidirectional USB Gecko transfer working in a GeckoOS code. It uses 119 lines of GeckoOS codes, and has been minimally tested (read: it probably crashes in many situations which I didn't test), but it does appear to function now. Very little impact on game performance, either, although I did need to lower the USB Gecko's latency timer. At the moment the code has no practical use, but once I develop a useful app that uses the code (and do further testing and cleanup), I will release it. Thanks for your help Nuke!
I'm pretty sure that I added a command to send and receive at the same time in the first version of the VHDL, which you may have. I never documented it or released it publicly.
I think it was command 0xC, i.e. EXI_CHAN1DATA = 0xC0000000;
One way to test is to make some test code for both read and write and use 0xC instead of 0xA and 0xB.
The VHDL I released is actually missing this and also the ID stuff; I think I uploaded the wrong version, v2.0, whereas v1.0 is the one people actually have.
Please try it and let me know.
Edit: I think 0xC was changed to a check TX command later on, so I will dig out my old sources and check this for you.
Quote from: Nuke on October 22, 2010, 05:43:31 AM
I'm pretty sure that I added a command to send and receive at the same time in the first version of the VHDL, which you may have. I never documented it or released it publicly.
I think it was command 0xC, i.e. EXI_CHAN1DATA = 0xC0000000;
One way to test is to make some test code for both read and write and use 0xC instead of 0xA and 0xB.
The VHDL I released is actually missing this and also the ID stuff; I think I uploaded the wrong version, v2.0, whereas v1.0 is the one people actually have.
Please try it and let me know.
Hey Nuke, thanks for the reply and info! The USB Gecko lib I have on hand includes this:
/*---------------------------------------------------------------------------------------------*
    Name:           usb_checksendstatus
    Description:    Check the FIFO is ready to send
*----------------------------------------------------------------------------------------------*/
static int __usb_checksendstatus()
{
    s32 i = 0;

    exi_chan1sr = 0x000000D0;
    exi_chan1data = 0xC0000000;   /* command 0xC: check send FIFO status */
    exi_chan1cr = 0x19;
    while((exi_chan1cr)&1);       /* wait for the transfer to finish */
    i = exi_chan1data;
    exi_chan1sr = 0x0;

    if (i&0x04000000){
        return 1;
    }
    return 0;
}
So it looks like command 0xC is used to check the send FIFO status, and 0xD is apparently used to check the receive FIFO status. Would your undocumented function be command 0xE? If you can post some documentation (C code would be awesome) on what I would have to do to send and receive something simultaneously, I would happily test it when I have a chance (although I admit I don't have loads of free time to do this testing). So, assuming that it's command 0xE, I would be giving this: exi_chan1data = 0xE0000000 | (sendbyte<<20); Should I expect the EXI bus to return 0x0C000000 if both the send and receive worked, since send is 0x04000000 and receive is 0x08000000? How will this behave if a byte is sent but not received? Would I use an ioctl on the PC side? If so, what control code would I specify to the FTDI driver's ioctl on the PC?
Thanks again Nuke, your expertise is very helpful!
It is full duplex, so it can send and receive at once; I just didn't document it and later removed the command. I will check my old backups when I go home tonight. It is the VHDL code that handles the procedure; it just does both read and write operations at once.
It's not commands E and F, as those are used to control the flash memory.
Try 0x6, 0x7 and 0x8; if none of them work then I probably did remove it in the released version.
i = exi_chan1data;
Put a printf("%x\n",i); after the above line and it will tell you the returned value. Yes, it should return 0xC :)
mmmm...VHDL....
I'm interested in this, as it could potentially benefit Gecko.NET, too.
Quote from: Nuke on October 22, 2010, 06:20:36 AM
It is full duplex, so it can send and receive at once; I just didn't document it and later removed the command. I will check my old backups when I go home tonight. It is the VHDL code that handles the procedure; it just does both read and write operations at once.
It's not commands E and F, as those are used to control the flash memory.
Try 0x6, 0x7 and 0x8; if none of them work then I probably did remove it in the released version.
i = exi_chan1data;
Put a printf("%x\n",i); after the above line and it will tell you the returned value. Yes, it should return 0xC :)
Hmm, well, my implementation using the half-duplex commands appears to be fast enough for now... I will try implementing it using the full-duplex command when I have some down time, but I'm not certain when that would be.
That said, one question: what would the code look like on the PC side? The FTDI driver has receive and send functions that don't appear to be full-duplex or thread-safe, and it appears to me that if I initiate sending bytes, the driver will wait until all the bytes are sent before it will let me receive, and vice versa. There's an ioctl function with no documentation; is that required? Or is there some other method that is necessary?
Thanks!
You could take advantage of it properly if threaded, but I never tried it further than a test, which is probably the reason I removed it. I've added it back to USB Gecko SE, so if you want to try it out, send me a PM.