Hi, got a question which may sound stupid, but here goes.
I'm interested in an application on the PC side which needs to send a large number of peek and poke commands to WiiRD. I'm wondering how many 32-bit peek and/or poke commands can be executed per second with WiiRD without adversely affecting the speed of the running game. As a rough guess, I'm hoping for roughly 18 peeks plus 18 pokes per frame. Fewer could be worked with; more would be excellent.
Thanks!
I've no clue what the max would be. What do you want to achieve?
Because that's what the code handler is made for anyway, peeking and poking around...
The maximum isn't something I've limited myself. When the game is running, though, the code handler is executed 60 times a second, once for each frame. Do you really mean 36 operations PER FRAME? Or do you mean 36 operations (18 pokes + 18 peeks) per second? I guess per second might be possible if you develop a very fast program. WiiRd isn't using the required environment for that and definitely won't be able to do that.
Hmm, ok. What I'm interested in is basically in the style of Sappharad's program GC-Online-Tunnel. The idea is, every frame (or every few frames, if trying to conserve bandwidth), peek the Wiimote data from the game, along with some game-state data to keep it synced (XYZ coords of the player, etc.), send that over the Internet or a LAN, receive such data from the remote player, poke that data into the local game, and repeat. The effect would be that any Wii game would be playable online, if a hacker determined the addresses to peek/poke to keep it synced. Based on my very rough estimates, given my experience in doing this for Smash Bros Melee with Fuzziqer Software's GCARS-CS, is that it could require up to 18 32-bit chunks of data for the more complex games to keep it synced. Meaning 18 peeks and 18 pokes, each frame that my program runs.
So what I'm understanding from you is that the WiiRd PC-side programs aren't capable of doing this, and I would have to write a very fast program myself to do this, but the Wii-side code could possibly handle this. Is this correct?
How many peeks/pokes can the Wii-side code handler process per frame, without drastically slowing down the game? (I understand that it executes every frame.)
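To put rough numbers on the request, here is a small Python sketch of the arithmetic. It only counts operations and payload bytes; it says nothing about how long each operation takes on the Wii or over USB. The 60 frames per second figure comes from the reply above, and the update interval of 6 frames is just an illustrative value for "every few frames".

```python
# Rough operation budget for the sync idea described above.
FRAMES_PER_SECOND = 60   # the code handler runs once per frame
PEEKS_PER_UPDATE = 18    # 32-bit reads per update
POKES_PER_UPDATE = 18    # 32-bit writes per update
BYTES_PER_VALUE = 4      # each peek/poke moves one 32-bit value


def ops_per_second(update_every_n_frames):
    """Total peek + poke operations per second for a given update interval."""
    updates_per_second = FRAMES_PER_SECOND / update_every_n_frames
    return (PEEKS_PER_UPDATE + POKES_PER_UPDATE) * updates_per_second


def payload_bytes_per_second(update_every_n_frames):
    """Payload only: ignores command bytes, addresses and acknowledgements."""
    return ops_per_second(update_every_n_frames) * BYTES_PER_VALUE


if __name__ == "__main__":
    for interval in (1, 6):  # every frame, and every 6th frame as an example
        print(interval, ops_per_second(interval), payload_bytes_per_second(interval))
```

The raw payload is small either way, so the number of separate operations is likely to matter more than the byte count.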
Thanks, and I apologize again for the weirdness of this question.
(Sorry for the double post)
So I've been looking at usbgecko.pas a bit. I assume that if I call this code directly, rather than going through WiiRd, it would be much faster. Still unsure of some things, so I hope someone can answer some questions. Assuming I use the code in usbgecko.pas, roughly how many peek/poke commands could I execute per frame, and is this limited by the code handler running on the Wii, or is it limited by the bandwidth of the USB Gecko? Are there any optimizations that could be made to usbgecko.pas which would increase the number of commands per frame? Could the peek commands be queued into the USB Gecko before the data is returned back to the PC, or would that overflow a buffer? Would pausing the game before executing the other commands, and unpausing after the commands have executed, help increase the commands per frame, or is that irrelevant? I don't know any Delphi; is there a C/C++ version of usbgecko.pas somewhere, or would creating a DLL be my best option for using this code from a C++ program?
Thanks again!
Technically, it shouldn't be too hard to rewrite usbgecko.pas to C/C++ - the functions you need to port are connect, initgecko, peek and poke - and there is also public C code over at http://www.usbgecko.com - look at the memory dumper - it has some C source code.. you just need the FTDI libraries and I'd happily assist you in what data to send to efficiently get high speed pokes/peeks.
Cool, I took a look at the memory dumper source; it looks like it would be helpful. I can't find the C/C++ version of libftdi.pas, though, which it looks to me like it's necessary for some of the code in usbgecko.pas to work. Any idea where I could find that code?
So for high-speed peeks/pokes, any idea how many would I likely be able to execute per frame, if I optimize my program well? Would 36 operations, executed every 6 frames or so, be likely to slow down the game unpleasantly?
Thanks!
Download the newest driver package for the FTDI chip which is in the USB Gecko:
The zip archive will contain ftd2xx.h, which is the header file for ftd2xx.lib in the i386 and amd64 directories (probably i386, as that's the platform most of us still build for). Then all commands should work and the mem dumper should be compilable.
Cool, I'll be sure to check that out. The connect code in usbgecko.pas and the C/C++ memory dumper seem to have some differences; any reason for that? E.g. some chunk size stuff, some other stuff that I noticed but don't remember offhand. Would I have to port the usbgecko.pas version to C++, or would the existing memory dumper connect code work fine? They seem to use a different Wii executable (memory dumper doesn't use the official GeckoOS, it uses a .elf that comes with the dumper); would that account for the differences?
And I apologize for repeating this question, but my pursuit of this project heavily depends on the likely performance of the setup, so: for high-speed peeks/pokes, any idea how many would I likely be able to execute per frame, if I optimize my program well? Would 36 operations (18 peeks, 18 pokes), executed every 6 frames or so, be likely to slow down the game unpleasantly? What about running the 36 operations every frame (which would be overkill and I don't need, but would be kickass)?
Thanks again! :)
That many operations will most likely slow down the process, especially since a peek is separated into multiple commands:
-peek announce (don't remember the ID)
-waiting for reply.. should be AA = success
-sending address (big-endian)
-waiting for reply
Thus I guess 36 operations every frame will slow down the game completely.
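The sequence above can be sketched in Python against a stand-in device, so that it runs without hardware. Several things here are assumptions rather than facts about the real code handler: the command ID `CMD_PEEK` is a placeholder (the real value is not given above), the `FakeGecko` class is purely hypothetical, and the second reply is assumed to be the 4-byte value itself. Only the 0xAA success byte and the big-endian address come from the description.

```python
ACK = 0xAA       # success reply, as described above
CMD_PEEK = 0x00  # HYPOTHETICAL placeholder: the real command ID is not given here


class FakeGecko:
    """Hypothetical stand-in for the Wii side, used only to run the sketch."""

    def __init__(self, memory):
        self.memory = memory           # dict: address -> 32-bit value
        self.outbox = bytearray()      # bytes waiting to be read by the PC
        self.awaiting_address = False  # True after a peek announce

    def write(self, data):
        if self.awaiting_address:
            # Second step: the PC sent the big-endian address.
            address = int.from_bytes(data, "big")
            value = self.memory.get(address, 0)
            self.outbox += value.to_bytes(4, "big")  # assumed reply: the value
            self.awaiting_address = False
        elif data == bytes([CMD_PEEK]):
            # First step: the PC announced a peek.
            self.outbox.append(ACK)
            self.awaiting_address = True

    def read(self, count):
        chunk = bytes(self.outbox[:count])
        del self.outbox[:count]
        return chunk


def peek32(dev, address):
    """One peek costs two write/read round trips in this scheme."""
    dev.write(bytes([CMD_PEEK]))           # peek announce
    if dev.read(1) != bytes([ACK]):        # wait for AA
        raise IOError("peek announce was not acknowledged")
    dev.write(address.to_bytes(4, "big"))  # address, big-endian
    reply = dev.read(4)                    # wait for the second reply
    if len(reply) != 4:
        raise IOError("short reply to address")
    return int.from_bytes(reply, "big")


if __name__ == "__main__":
    gecko = FakeGecko({0x80001234: 0xDEADBEEF})
    print(hex(peek32(gecko, 0x80001234)))
```

Each peek waits on the device twice, which is why 18 of them per frame adds up quickly.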
For the headers.. the code of the mem dumper works.. the chunk size was just tweaked in our case to get maximum stability. Other than that, nothing should need changing. It is just split into connection and initialization; basically it's good to reinitialize the USB Gecko before several commands, like poke.
Hmm, ok, so since I need to do around 36 operations every 6 frames or so, would there be a lot of slowdown if I averaged it and did 6 operations every frame? Any estimate of how much slowdown that would cause? (A rough estimate is fine, e.g. would it reduce the speed by 80%, 50%, 20%, etc.?)
As for the peek operation, is there any particular practical reason why it needs to split it into two commands? Would it be possible to "queue" the commands by sending the second command before the response to the first one is received, so that the moment the Wii sends the AA, it already has the following data, so that it can continue immediately? If this is not possible, is there any particular practical reason for this setup?
Sorry for the abundance of questions here. :)
Thanks!
It might work by sending all 5 bytes in one go.. I must admit though that it is untested.
The reason for this setup - simple: we always have to expect commands to fail.. so for example if another command is running for some reason (crash or something similar on Wii side) we wouldn't receive AA.. in that case we try to send FF first - meaning "sendfail" which should reset all running commands - so that the Wii can reply properly again..
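A minimal Python sketch of both ideas: building the command and address as a single 5-byte write, and falling back to FF when AA does not arrive. The command ID is again a placeholder, and `write` and `read` are hypothetical callables standing in for whatever USB calls the real program uses. Whether the Wii side accepts the combined write is untested, as noted above.

```python
ACK = 0xAA        # success reply
SEND_FAIL = 0xFF  # "sendfail": asks the Wii side to reset running commands
CMD_PEEK = 0x00   # HYPOTHETICAL placeholder: the real command ID is not given here


def build_peek_packet(address):
    """Command byte plus big-endian address, meant to go out as one write."""
    return bytes([CMD_PEEK]) + address.to_bytes(4, "big")


def request_with_reset(write, read, packet, max_attempts=2):
    """Send packet; if the first reply byte is not AA, send FF and try again.

    write(data) and read(count) are hypothetical transport callables.
    Returns True once AA is received, False if every attempt failed.
    """
    for _ in range(max_attempts):
        write(packet)
        if read(1) == bytes([ACK]):
            return True
        write(bytes([SEND_FAIL]))  # reset the Wii side before retrying
    return False


if __name__ == "__main__":
    replies = [b"\x00", b"\xaa"]  # scripted: first attempt fails, second succeeds
    sent = []
    ok = request_with_reset(sent.append, lambda n: replies.pop(0),
                            build_peek_packet(0x80000000))
    print(ok, sent)
```

A program that can tolerate a dropped update could use `max_attempts=1` and simply skip that frame.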
Hi,
Yes, you can queue commands and use non-synchronous transmission, and it would work well if you're sending small packets, because the USB chip is buffered, i.e. it has a 256-byte receive buffer and a 128-byte transmit buffer. It is only when the buffer is full and you're sending larger packets that it needs to be handled correctly.
As USB Gecko was designed to be non-IRQ-driven, you should really poll the FIFO buffer before sending, though, to make sure the buffer is not full.
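As a sketch of that advice in Python: split the outgoing data into pieces no larger than the FIFO, and poll for free space before each write. The default of 128 bytes is an assumption that simply takes the smaller of the two buffer sizes mentioned above, since the direction of each buffer is not spelled out. `free_space` and `write` are hypothetical callables, not real driver functions.

```python
import time


def split_for_fifo(payload, fifo_size=128):
    """Split payload into chunks that each fit in a FIFO of fifo_size bytes."""
    if fifo_size <= 0:
        raise ValueError("fifo_size must be positive")
    return [payload[i:i + fifo_size] for i in range(0, len(payload), fifo_size)]


def send_all(payload, free_space, write, fifo_size=128):
    """Poll before every write so the FIFO is never overfilled.

    free_space() is a hypothetical callable returning the bytes currently free;
    write(chunk) is a hypothetical callable that queues one chunk.
    """
    for chunk in split_for_fifo(payload, fifo_size):
        while free_space() < len(chunk):
            time.sleep(0.001)  # back off briefly instead of spinning
        write(chunk)


if __name__ == "__main__":
    written = []
    send_all(bytes(300), lambda: 128, written.append)
    print([len(chunk) for chunk in written])
```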
You will have to experiment with your protocol, as there are many ways to handle things. There is even an undocumented command to send and receive data at the same time, if you're having problems with speed.
It sounds like an interesting project; good luck with it.
Quote from: biolizard89 on October 03, 2008, 12:19:23 AM
Hmm, ok, so since I need to do around 36 operations every 6 frames or so, would there be a lot of slowdown if I averaged it and did 6 operations every frame? Any estimate of how much slowdown that would cause? (A rough estimate is fine, e.g. would it reduce the speed by 80%, 50%, 20%, etc.?)
As for the peek operation, is there any particular practical reason why it needs to split it into two commands? Would it be possible to "queue" the commands by sending the second command before the response to the first one is received, so that the moment the Wii sends the AA, it already has the following data, so that it can continue immediately? If this is not possible, is there any particular practical reason for this setup?
Sorry for the abundance of questions here. :)
Thanks!
Okay, so based on that, I would probably send all the data for the frame to the Wii, and then receive the replies as they come back. If an error occurred, I would just ignore that data, send the fail command, and wait until the next frame. This sounds like it would work, since my program should still work fine if an occasional frame of data gets dropped. Does this sound to you like it would work okay?
Also, it occurred to me, some of my 32-bit variables are next to each other in memory (e.g. a player's X, Y, and Z coords are right next to each other), so if I ask the Wii to dump 12 bytes, would that be faster than peeking 3 32-bit values? If so, about how much faster? Similarly, about how much faster would uploading 12 bytes be compared to poking 3 32-bit values?
Nuke, when you say 256byte receive and 128byte send, is this from the point of view of the PC? I.e. 256byte buffer for Wii-->PC and 128byte buffer for PC-->Wii, or is this the other way around? Sorry if I'm being dense here. To check if the send buffer on the PC side is full, is the function of choice FT_GetStatus?
As for this "undocumented command"... is this FT_IoCtl? I could not find any info with Google on what it does; do you know how it works? If so, would you be able to share?
Thanks for the info.
It is buffered on the chip itself, the FIFO buffers.
You want to grab as many bytes as you can at once if possible; the fewer USB transmissions, the better. Grabbing 12 bytes in one go would be faster than grabbing 4 bytes three times.
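On the PC side, a single 12-byte dump can be split back into its three values with one call. This Python sketch assumes the coordinates are 32-bit floats, which depends on the game; the big-endian byte order matches the Wii. For integer values, the format string `">III"` would be used instead.

```python
import struct


def decode_xyz(block):
    """Decode 12 contiguous bytes as three big-endian 32-bit floats (assumed type)."""
    if len(block) != 12:
        raise ValueError("expected exactly 12 bytes")
    return struct.unpack(">fff", block)


def transfers_needed(num_values, contiguous):
    """One dump if the values sit next to each other, else one peek per value."""
    return 1 if contiguous else num_values


if __name__ == "__main__":
    # Build a sample block locally; a real program would get it from a dump.
    sample = struct.pack(">fff", 1.0, 2.5, -3.0)
    print(decode_xyz(sample), transfers_needed(3, True), transfers_needed(3, False))
```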
As for how well your program would work, it is hard for me to say, as it's all speculation; I don't know how a tunneling program would work. How do you actually share the player data with the other person?
How the program would work, roughly, would be, every few frames (I figure 10Hz is good enough), dump some variables which are useful to sync the Wiis (Wiimote data, plus some stuff like player XYZ coords, player health, player XYZ velocity, etc. -- what I sync depends on the game), and send the resulting data over either TCP or UDP to a remote player (TCP would be easier to program, as I wouldn't have to worry about data getting lost, but UDP would probably be faster... haven't really made that decision yet). When data arrives from a remote player, upload it back to the Wii. The setup is pretty similar to what Sappharad did with GC-Online-Tunnel using GCNrd, although my program would probably support more advanced stuff than what Sappharad did, since his program was basically a proof-of-concept put together in about 4 days. E.g. I would like my program to support pointers and conditionals (which would probably put more load on the USB connection, but hopefully that isn't a big deal). If a USB error occurs as Link said, I can wait until the next frame to send/receive the data, as long as the network connection between the PCs is reliable.
So yeah, Wii1 <--USB--> PC1 <--TCP/UDP--> PC2 <--USB--> Wii2.
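For the PC-to-PC leg, one possible packet layout is sketched below in Python. The layout is entirely hypothetical: a 32-bit frame counter, a 16-bit value count, then the 32-bit values, all big-endian. The frame counter is there so that a receiver using UDP could discard packets that arrive late or out of order.

```python
import struct

HEADER = ">IH"  # HYPOTHETICAL layout: frame counter (u32), value count (u16)


def pack_sync(frame, values):
    """Serialize one update: header followed by the 32-bit values."""
    return struct.pack(HEADER, frame, len(values)) + struct.pack(
        ">%dI" % len(values), *values)


def unpack_sync(data):
    """Inverse of pack_sync; returns (frame, list_of_values)."""
    frame, count = struct.unpack_from(HEADER, data, 0)
    values = struct.unpack_from(">%dI" % count, data, struct.calcsize(HEADER))
    return frame, list(values)


if __name__ == "__main__":
    packet = pack_sync(600, list(range(18)))
    print(len(packet), unpack_sync(packet)[0])
```

With 18 values the packet is 78 bytes, so it fits comfortably in a single datagram.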
(Sorry for the double post)
Here's what I think would be a nice feature in the code handler and WiiRD. Have a pointer version of the poke/upload/peek/dump commands. Data sent would be the pointer address, offset, data size, and data. This would, IMHO, make things a lot more efficient, as the PC would not have to parse the pointers and go back to the Wii to get the correct values. And in cases like mine, where efficiency is key, it could help greatly. It would also help deal with pointers in the GUI, since you could poke a pointer value directly, instead of navigating to the correct address by yourself, making it easier to evaluate the results of a pointer search.
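To illustrate the cost being described, here is a Python sketch of how pointer resolution looks when it is done on the PC side: one read for the pointer, then a second read at pointer plus offset. `CountingMemory` is a hypothetical stand-in for the Wii that counts round trips, and the addresses are made up for the example.

```python
class CountingMemory:
    """Hypothetical stand-in for Wii memory that counts each peek."""

    def __init__(self, words):
        self.words = words    # dict: address -> 32-bit value
        self.round_trips = 0  # number of peeks issued so far

    def peek32(self, address):
        self.round_trips += 1
        return self.words[address]


def read_via_pointer(peek32, pointer_address, offset):
    """PC-side resolution: read the pointer, then read the value it points at."""
    base = peek32(pointer_address)
    return peek32(base + offset)


if __name__ == "__main__":
    # Made-up addresses: a pointer at 0x80500000 pointing to 0x90000000.
    mem = CountingMemory({0x80500000: 0x90000000, 0x90000010: 1234})
    print(read_via_pointer(mem.peek32, 0x80500000, 0x10), mem.round_trips)
```

A Wii-side pointer command would reduce this to a single round trip per value, which is the saving the suggestion is after.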
Do you think this would be useful?
Thanks!
Phew.. there is some idea like that in the plan for the current WiiRd GUI. However, currently things are still parsed on the PC side. Parsing on the Wii side.. well.. possible.. however, our code handler wasn't optimized for functionality and speed.. it was optimized for being as small as possible while still having some proper functionality. However: you can modify the code handler.. download the sources of Gecko OS 1.06f, which are publicly available.. kenobiwii.s or so (can't remember the exact file name) is the code handler. You can modify it and recompile Gecko OS. Then it would have the functionality you want.
Thanks for the info. Unfortunately, I do not know any ASM. :-[ I can do C/C++, but ASM is something that I have never been able to get good at. Thus, I can tell by the comments sort of what is going on at a high level, but actually adding a command... not something I could really do.
I understand that the code handler has to be very small; otherwise, it would probably impact compatibility. If I recall correctly, the code handler is about 4KiB right now; any idea how much Wii-side pointer upload/dump commands would add to that? Would the difference be likely to screw up games?
I'm certainly not going to ask you to compromise game compatibility just for a minor feature I'd like. But if you felt like adding the feature, I certainly wouldn't complain. If nothing else, I could probably find someone who knows ASM who would be willing to make that modification for me.
Thanks!
If the parsing were done Wii-side, I wouldn't care if my fps dropped a bit.
Quote from: biolizard89 on October 14, 2008, 06:43:04 AM
I understand that the code handler has to be very small; otherwise, it would probably impact compatibility. If I recall correctly, the code handler is about 4KiB right now; any idea how much Wii-side pointer upload/dump commands would add to that? Would the difference be likely to screw up games?
Technically speaking, we'd have room to add such a command. The main question is whether we really want to.. we wanted to keep the code handler small to allow as many codes as possible (256 lines as of now). Adding commands will reduce this, so we'd definitely not add that command to retail versions as of now... I'll have to see about other things though.
Hmm, does ASM make it easy to do conditional compilation like C/C++'s #ifdef directive? If so, maybe it could be added to the source code but disabled in official builds, maybe with an alternative build available with the feature included?
While I'm talking about additional features for the code handler, something else that would be cool would be adding a code-type that dumps a certain area of RAM over the USB Gecko (without waiting for the PC's response). Combining that with a conditional counter code, the code handler could then dump all the data I need every, say 6 frames, without the USB Gecko overhead of repeated dumping. Have 2 sub-types, one for base address and one for pointer, and have the value part of the code be the number of bytes to dump. This would actually be useful for WiiRD GUI as well, as a hacker could "watch" 4-5 different RAM values while hacking, without USB Gecko overhead. And again, this could be a conditionally compiled option so that people who are just using the Ocarina feature can have as many codes as possible.
And by the way, thanks for at least taking my ideas seriously. There are so many developers out there who dismiss ideas without discussion because not many people would use a feature, and it's nice to see that, regardless of whether my suggestions get implemented, you're willing to discuss them. We need more developers like that, so thanks.
Quote from: Link on October 15, 2008, 07:11:25 AM
Quote from: biolizard89 on October 14, 2008, 06:43:04 AM
I understand that the code handler has to be very small; otherwise, it would probably impact compatibility. If I recall correctly, the code handler is about 4KiB right now; any idea how much Wii-side pointer upload/dump commands would add to that? Would the difference be likely to screw up games?
Technically speaking, we'd have room to add such a command. The main question is whether we really want to.. we wanted to keep the code handler small to allow as many codes as possible (256 lines as of now). Adding commands will reduce this, so we'd definitely not add that command to retail versions as of now... I'll have to see about other things though.
You could have Gecko OS read the codes and see whether they need the new code handler. Newer codes would need the new code handler, so there would be fewer lines available.
If there aren't any newer codes, just run the old handler.
Quote from: Igglyboo on October 16, 2008, 03:19:38 AM
You could have Gecko OS read the codes and see whether they need the new code handler. Newer codes would need the new code handler, so there would be fewer lines available.
If there aren't any newer codes, just run the old handler.
Actually, yeah, that's probably better than my idea with the conditional compilation. Just have both in GeckoOS and decide which one to load. That said, the suggestions I made would probably be used in conjunction with WiiRD or some other way of dynamically using them through USB Gecko, so rather than reading the Ocarina code list to determine what code handler to use, just give users a choice of "Standard" or "Debug" handlers. Make Standard the default, maybe bury it in a menu so that people who don't know what they're doing will ignore it, and make it clear in the documentation that if you want to use the advanced debug features that my suggestions would use, you need to tell GeckoOS to use the Debug handler. People who are debugging probably don't need 256 lines of codes, and people who are just using the Ocarina features probably don't need those debugging features, so that sounds like a win-win.
Does that sound possibly workable?
Thanks!
(Double post again, sorry)
Actually, it just occurred to me that a feature which might minimize the need for the aforementioned features would be a command for dumping/uploading the Gecko Registers. I know you can do this now with a hardcoded address, but a future-proof option would be a lot safer. If I had the ability to dump/upload the Gecko Registers, I could rig up a simple WiiRD code that copied that data into and from the actual variable addresses, and I'd be able to use pointers with minimal USB Gecko transfer.
This would also be useful for debugging codes that use the Gecko Registers.
Does this sound like a useful feature?
Thanks!
EDIT: Actually, now I'm not sure... are the Gecko Registers' addresses safe to hardcode? The code type documentation implies that they are safe. Is this true? Thanks.
(Apologies for the repeated posts here)
Quote from: Nuke on October 03, 2008, 10:16:23 AM
You will have to experiment with your protocol, as there are many ways to handle things. There is even an undocumented command to send and receive data at the same time, if you're having problems with speed.
I'm assuming this is FT_IoCtl. From looking at what arguments it takes, it looks like I can give it a send and receive buffer and it will process both simultaneously. This would be quite helpful. However, there is one argument that I'm clueless on. What the heck does the DWORD dwIoControlCode argument do, and what value(s) can it accept? If you know, would you mind sharing?
Thanks.
(Please forgive me for the repeated posts, I'm not trying to spam here)
Okay, here's a performance question. If, every frame, all I do is upload 8352 bytes and dump 288 bytes, is this rough level of USB Gecko activity going to slow down the game significantly? I assume changing the packet size would be necessary to get high efficiency, but that isn't a big deal. Any idea how much data I can upload or dump in one operation each per frame and not slow down the game significantly? This is pretty much the maximum level of usage I can imagine, I'm wondering if it would be anywhere close to workable.
Thanks! :)
As far as I remember, upload/download data is sent in blocks.. uploads are in 0xF80-byte blocks, downloads in 0xF800-byte blocks. As long as you're below that block size, a transfer is done in one cycle. Otherwise it takes two or more!
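Applying those block sizes to the figures in the question gives the following, as a quick Python calculation. The block sizes are taken from the reply above as remembered values, not checked against the code handler source.

```python
UPLOAD_BLOCK = 0xF80     # 3968 bytes per upload block (as remembered above)
DOWNLOAD_BLOCK = 0xF800  # 63488 bytes per download block (as remembered above)


def blocks_needed(num_bytes, block_size):
    """Number of blocks (cycles) needed to move num_bytes; rounds up."""
    return -(-num_bytes // block_size)


if __name__ == "__main__":
    print("upload 8352 bytes:", blocks_needed(8352, UPLOAD_BLOCK), "blocks")
    print("dump 288 bytes:", blocks_needed(288, DOWNLOAD_BLOCK), "blocks")
```

Under these assumptions the 8352-byte upload needs three blocks, while the 288-byte dump fits in one.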