RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Fri Aug 26, 2005 2:40 pm Post subject: SNES debugging emulator |
My god, I can't believe how easy this is!
When I'm not working on Endless Saga or busy chasing down a checkbook at the bank to move into my new apartment, I do some ROM hacking. Not the pitiful "open the ROM in an editor and tweak stuff" kind, but the raw and gory hex-tweaking/assembly-editing kind. My primary interest lies in dumping a game's code and figuring the whole thing out.
Needless to say, I have a hefty amount of experience with the 65816 processor. And even more needless to say, getting my hands on some tools to debug code makes the task far easier. :P But I have a sick aversion to using others' code and programs in my own personal projects and write my own utilities. As a result, I have, among other things, an intelligent 65816 decompiler. Writing this instead of using someone else's taught me a heap of stuff about the 65816 and was, all in all, a very enriching experience. The kind I wouldn't have gotten if I had just hunted around for a pre-written program. :P
Lately, I've had an idea that would simplify my work quite a bit: something that would be able to take 65816 code, a few initial values, and run a function and provide a step-by-step trace of what goes on. Sure as hell beats doing it all by hand! Seemed easy enough, too, as assembly isn't exactly a difficult language to learn despite what people think. Most instructions are basically single bitwise operations (x = y, x |= y, x += y, x -= y, x &= y...) with a few modifications thrown in depending on the state of the processor flags.
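To make the "single operation plus flag handling" point concrete, here is a minimal Python sketch of one instruction: an 8-bit ADC in binary (non-decimal) mode. The name `adc8` and the flag dictionary are illustrative assumptions, not taken from the emulator described in this post.

```python
def adc8(a, operand, carry):
    """8-bit ADC in binary mode: A = A + operand + C, then update the flags.

    Hypothetical helper for illustration; decimal mode is not handled.
    """
    total = a + operand + carry
    result = total & 0xFF
    flags = {
        "C": int(total > 0xFF),   # carry out of bit 7
        "Z": int(result == 0),    # result is zero
        "N": result >> 7,         # bit 7 of the result
        # Signed overflow: both inputs share a sign that the result lacks.
        "V": int(((a ^ result) & (operand ^ result) & 0x80) != 0),
    }
    return result, flags


result, flags = adc8(0x50, 0x50, 0)
print(hex(result), flags)
```

The addition itself is one line; everything else is flag bookkeeping, which is roughly the shape most opcode handlers end up taking.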
It's been two days and I'm slowly finding myself writing an emulator now (sans hardware addresses, though by the time I'm done I'll have to emulate at least a few basic ones, such as the arithmetic coprocessor and mode 7 matrix operations). And I'm blown away by how easy this actually is. Granted, low-level code is definitely my forte, but still, I was expecting a major mess. And after two days I've pretty much attained my initial goal. Now all I have left to do is finish writing the handful of missing opcodes and implement hardware addresses. Two days and it's rapidly becoming a basic but fully-functional emulator. Huh...
Although I never have and never will release open source code, I fully intend to write a very in-depth document on how the experience went and how everything was implemented for anyone interested in partaking in this sort of experience. All I have to say is this: "Wow. This shit is FUN. :P"
_________________ Endless Saga
An OpenGL RPG in the making. Now with new hosting!
|
Mandrake elementry school minded asshole
Joined: 28 May 2002 Posts: 1341 Location: GNARR!
|
Posted: Fri Aug 26, 2005 2:47 pm Post subject: |
Fuck opening the source, this document sounds much more beneficial, and I for one would love to read it.
_________________ "Well, last time I flicked on a lighter, I'm pretty sure I didn't create a black hole." - Xmark
http://pauljessup.com
|
tcaudilllg Dragonmaster
Joined: 20 Jun 2002 Posts: 1731 Location: Cedar Bluff, VA
|
Posted: Fri Aug 26, 2005 7:23 pm Post subject: |
Decompiler or disassembler?
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Fri Aug 26, 2005 8:38 pm Post subject: |
LordGalbalan wrote: | Decompiler or disassembler? |
Neither. This emulates 65816 code (along with a few SNES hardware addresses.)
Edit: Currently covering 36 opcodes. Most of the ones not yet covered are different addressing methods for the ones already covered, so it's just a matter of writing the code. Works like a charm so far.
Last edited by RuneLancer on Sat Aug 27, 2005 10:40 pm; edited 1 time in total
|
tcaudilllg Dragonmaster
Joined: 20 Jun 2002 Posts: 1731 Location: Cedar Bluff, VA
|
Posted: Fri Aug 26, 2005 11:11 pm Post subject: |
Quote: | As a result, I have, among other things, an intelligent 65816 decompiler. |
I mean that. Is it a real decompiler, or a disassembler?
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Sat Aug 27, 2005 12:12 am Post subject: |
LordGalbalan wrote: | Quote: | As a result, I have, among other things, an intelligent 65816 decompiler. |
I mean that. Is it a real decompiler, or a disassembler? | Depends what you consider a decompiler. It decompiles to a pseudo-language but, for very obvious reasons, doesn't decompile to code that could be compiled. Not platform-independently, at least. Compilable code would also have to include the SNES hardware, which would pretty much amount to writing a self-contained emulator. Which would be dumb. :)
Code: | $ED = A
$E7 = Y
$EB = X
$E9 = 0x7E
Y = 0x0000
0F9C [$E7](Y) = [$EB](Y)
Y = Y + 0x02
If Y != 0x0020 Then 0F9C
Return |
Taken from...
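Code: | STA $ED ; Source bank (A) -> bank byte of the 24-bit pointer at $EB
STY $E7 ; Destination address (Y) -> pointer at $E7
STX $EB ; Source address (X) -> pointer at $EB
LDA 0x7E
STA $E9 ; Destination bank = $7E (work RAM)
LDY 0x0000 ; Immediate, matching "Y = 0x0000" in the pseudo-code
REP 0x20 ; 16-bit accumulator
0F9C LDA [$EB],Y ; Read a word through the source pointer
STA [$E7],Y ; Write it through the destination pointer
INY
INY
CPY 0x0020 ; 32 bytes copied yet?
BNE $0F9C
SEP 0x20 ; Back to an 8-bit accumulator
RTS |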
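For readers who don't speak 65816, the routine above appears to be a 32-byte block copy through two 24-bit pointers. The following Python sketch shows the same behaviour under that reading. The function name, the dictionary standing in for memory, and the parameter names are assumptions made for illustration; they are not output of the decompiler.

```python
def copy_32_bytes(memory, src_bank, src_addr, dst_addr, dst_bank=0x7E):
    """Copy 0x20 bytes, one 16-bit word at a time, like the loop at 0F9C.

    `memory` is a hypothetical dict mapping 24-bit addresses to byte values.
    """
    src = (src_bank << 16) | src_addr   # pointer stored at $EB/$EC/$ED
    dst = (dst_bank << 16) | dst_addr   # pointer stored at $E7/$E8/$E9
    y = 0x0000
    while True:
        # 16-bit LDA/STA moves the low byte, then the high byte.
        memory[dst + y] = memory.get(src + y, 0)
        memory[dst + y + 1] = memory.get(src + y + 1, 0)
        y += 2                          # INY, INY
        if y == 0x0020:                 # CPY 0x0020 / BNE
            break


ram = {0x123456 + i: i + 1 for i in range(40)}
copy_32_bytes(ram, 0x12, 0x3456, 0x2000)
print([ram[0x7E2000 + i] for i in range(4)])
```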
|
tcaudilllg Dragonmaster
Joined: 20 Jun 2002 Posts: 1731 Location: Cedar Bluff, VA
|
Posted: Sun Aug 28, 2005 7:02 pm Post subject: |
RuneLancer wrote: | Depends what you consider a decompiler. It decompiles to a pseudo-language but, for very obvious reasons, doesn't decompile to code that could be compiled. [...] |
Not perfect, as you said, but definitely a start.
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Sun Aug 28, 2005 8:10 pm Post subject: |
LordGalbalan wrote: | Not perfect, as you said, but definitely a start. |
As you seem to know more about this than me, do provide constructive criticism on how to improve it and I'll follow through with any useful ideas thrown my way.
Do keep in mind that that's my decompiler, and has very little to do with the project this thread's about. Wouldn't hurt to discuss things on-topic, but then again, this IS the inter-web. :)
Speaking of which, I currently emulate 40 opcodes. Most of the non-implemented ones are simple addressing variations on the currently emulated ones, however, and it's just a matter of coding them in (for instance, LDA $1200, LDA $1200,Y, LDA $1200,X, LDA $45, etc.). At this rate, I'll have to start considering implementing hardware addresses soon enough.
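One common way to keep those addressing variations cheap is to write the operation once and pair it with a separate address-resolving function per opcode. The Python sketch below illustrates that idea for the four LDA forms mentioned above, with an 8-bit accumulator. It is not the code from this project: the `cpu` dictionary, the helper names and the dict-based memory are all assumptions. The opcode bytes are the standard 65816 encodings for those forms.

```python
def addr_direct(cpu, operand):        # LDA $45
    return (cpu["D"] + operand) & 0xFFFF          # direct page lives in bank 0

def addr_absolute(cpu, operand):      # LDA $1200
    return (cpu["DBR"] << 16) | operand

def addr_absolute_x(cpu, operand):    # LDA $1200,X
    return ((cpu["DBR"] << 16) + operand + cpu["X"]) & 0xFFFFFF

def addr_absolute_y(cpu, operand):    # LDA $1200,Y
    return ((cpu["DBR"] << 16) + operand + cpu["Y"]) & 0xFFFFFF

def lda(cpu, memory, address):
    """The operation itself, written once (8-bit accumulator assumed)."""
    value = memory.get(address, 0)
    cpu["A"] = value
    cpu["Z"] = int(value == 0)
    cpu["N"] = value >> 7

# opcode byte -> (operation, address resolver)
OPCODES = {
    0xA5: (lda, addr_direct),
    0xAD: (lda, addr_absolute),
    0xBD: (lda, addr_absolute_x),
    0xB9: (lda, addr_absolute_y),
}

def execute(cpu, memory, opcode, operand):
    operation, resolve = OPCODES[opcode]
    operation(cpu, memory, resolve(cpu, operand))


cpu = {"A": 0, "X": 2, "Y": 4, "D": 0, "DBR": 0x7E, "Z": 0, "N": 0}
memory = {0x7E1202: 0x80}
execute(cpu, memory, 0xBD, 0x1200)    # LDA $1200,X
print(hex(cpu["A"]), cpu["N"])
```

With this layout, adding another addressing form for an already-emulated instruction is one resolver (if it doesn't exist yet) plus one table entry.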
|
tcaudilllg Dragonmaster
Joined: 20 Jun 2002 Posts: 1731 Location: Cedar Bluff, VA
|
Posted: Sun Aug 28, 2005 8:38 pm Post subject: |
I began a CPU core for the NES some years ago, and got pretty far on it... but I did it in QBasic. Once I realized its performance issues were insurmountable, I stopped work on it.
Decompilation is probably the best route to extending existing games: for example, making a massively multiplayer FFVI (if it struck someone's fancy to do so). Far faster than recreating the project through reductive reasoning, as was done with Zelda Classic.
What would be most beneficial, although I'm sure it'd cause a few shockwaves in our world, would be first to decompile the game, and then to recompile it for a different architecture. One-time emulation. No speed issues at all.
But expect vigorous retaliation from untold quarters of the software industry if such a thing were to even begin taking place. That said... it's not as difficult as it sounds. It would take a lot of planning to map the functions of a console onto those of a PC; however, emulators already do this. The only difference is that they don't reflect the mapping in compiled code.
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Sun Aug 28, 2005 8:59 pm Post subject: |
What you're describing is called Dynamic Recompilation. The world thought of that a long time ago.
http://acorn.cybervillage.co.uk/emulation/dynrcomp.htm
Of course, this no longer has anything to do with ROM hacking.
|
Sirocco Mage
Joined: 01 Jun 2002 Posts: 345
|
Posted: Mon Aug 29, 2005 3:43 am Post subject: |
Quote: |
What would be most beneficial, although I'm sure it'd cause a few shockwaves in our world, would be first to decompile the game, and then to recompile it for a different architecture. One-time emulation. No speed issues at all.
|
Two points:
1. This has already been done -- by several dozen emulators.
2. The process is far from perfect, and although you generally see a tremendous speed boost after embracing dynamic recompilation, you won't be happy emulating a PS2 on your old Pentium 2 any time soon... or ever :)
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Mon Aug 29, 2005 4:22 am Post subject: |
Most of the more recent systems are emulated via dynamic recompilation. Anything else would be too horrific to bear on all but top-spec machines, as emulation is, after all, interpreted code otherwise.
The fact remains that even though the assembly code is recompiled from the instruction set of the system being emulated to (say) x86 assembly, that does zilch for the hardware, which is the most important part. Software can only get a system so far; hardware is where all of the emulation problems occur.
An example of this is the SNES math coprocessor. The way it works is pretty simple. Suppose you want to multiply two values together...
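Code: | PHP ; Pushing the flags, to be clean.
SEP 0x20 ; Just for cleanness' sake, gonna ensure 8bit.
LDA 0x10 ; A = 16
STA $4202 ; Multiplicand
LDA 0x05 ; A = 5
STA $4203 ; Multiplier; writing it starts the multiplication
NOP
NOP
NOP ; Waiting out the 8 cycles
LDA $4216 ; Low byte of the result (0x50); $4217 holds the high byte
PLP ; Restoring the flags
RTS |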
Quick and useless little multiplication routine. What's this do? Takes 0x10 and 0x05, multiplies them together, and loads the result in A. How does it do this? This is where we hit a few problems. $4202 and $4203 are the multiplicand and multiplier registers. As soon as $4203 is written to, the arithmetic coprocessor begins to calculate the result of multiplying $4202 and $4203 together. This takes 8 machine cycles. Then, the result is placed in $4216/$4217 where it can be read back by the code. Of course, efficient code would use the extra cycles to do other stuff if possible instead of NOPing.
This has to be emulated. No matter how much recompiling you do, something has to play the part of the coprocessor and monitor $4203.
Now imagine VRAM behaving like this. Oh, wait, it does on pretty much every system. ;)
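As a rough illustration of "something has to play the part of the coprocessor", here is a Python sketch of a memory-mapped multiplier that watches writes to $4203 and only publishes the product once 8 cycles have been counted. The class, its method names and the cycle bookkeeping are assumptions, not this emulator's code. In particular, returning the previous (stale) result to an early read is a simplification chosen for the sketch, not a claim about what the real hardware returns.

```python
class Multiplier:
    """Hypothetical model of the $4202/$4203 -> $4216/$4217 multiplier."""

    WRMPYA, WRMPYB = 0x4202, 0x4203   # multiplicand, multiplier
    RDMPYL, RDMPYH = 0x4216, 0x4217   # product, low and high byte
    LATENCY = 8                       # cycles, as described in the post

    def __init__(self):
        self.multiplicand = 0
        self.result = 0               # value currently visible to reads
        self.pending = None           # product still being computed
        self.ready_at = 0
        self.cycles = 0

    def tick(self, cycles):
        """Advance time; called by the CPU core after each instruction."""
        self.cycles += cycles
        if self.pending is not None and self.cycles >= self.ready_at:
            self.result, self.pending = self.pending, None

    def write(self, address, value):
        if address == self.WRMPYA:
            self.multiplicand = value & 0xFF
        elif address == self.WRMPYB:  # this write starts the multiplication
            self.pending = self.multiplicand * (value & 0xFF)
            self.ready_at = self.cycles + self.LATENCY

    def read(self, address):
        if address == self.RDMPYL:
            return self.result & 0xFF
        if address == self.RDMPYH:
            return self.result >> 8
        raise ValueError("not a multiplier register")


mul = Multiplier()
mul.write(0x4202, 0x10)
mul.write(0x4203, 0x05)
too_early = mul.read(0x4216)   # read before the 8 cycles are up
mul.tick(8)
print(hex(too_early), hex(mul.read(0x4216)))
```

The point of the design is that the CPU core never special-cases multiplication: it just routes reads and writes in this address range to the object and reports elapsed cycles.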
Edit: 63 opcodes. About a quarter of the way there.
|
tcaudilllg Dragonmaster
Joined: 20 Jun 2002 Posts: 1731 Location: Cedar Bluff, VA
|
Posted: Mon Aug 29, 2005 7:11 pm Post subject: |
Ignore the machine cycles. Ignore everything having to do with the machine cycles. You'll have to rearrange the code, though.
Code: |
PHP ; Pushing the flags, to be clean.
SEP 0x20 ; Just for cleanness' sake, gonna ensure 8bit.
LDA 0x10 ; A = 16
STA $4202
LDA 0x05 ; A = 5
STA $4203
LDA $4216 ; Read moved up, ahead of the NOPs
NOP
NOP ; As you said, the NOPs should be replaced with operations.
NOP ; We anticipate they are, and leave them in place.
PLP ; Restoring the flags
RTS
|
Perhaps in some cases this is ineffective? In that case, it is necessary to determine at what points the cycle times are themselves relevant to the program execution speed, and substitute an alternative timing system.
We know that the multiplication takes eight cycles, at which point it would make sense for the programmer to try to get the result. A recompiler should anticipate this. After all, erasing/replacing a value before it is used makes little sense.
Not perfect, no, but in most cases effective.
|
RuneLancer Mage
Joined: 17 Jun 2005 Posts: 441
|
Posted: Tue Aug 30, 2005 12:28 am Post subject: |
That code is absolute crap. :) Even if you ignore machine cycles, which are there to prevent resource conflicts (among other vitally important things.)
In a best-case scenario, the arithmetic coprocessor will not have been used yet and you'll pull a 0, which will give incorrect results in most cases (generally when grabbing an index into a structure). But chances are you'll have used it at some other point in your code, and the coprocessor won't yet have overwritten the old result with the new one. (And in fact, I believe the SNES doesn't initialise these registers, so there'd be crap in them either way.)
That address contains VERY unreliable data until the 8 cycles have taken place. While your idea sounds good in theory, it will fail quite badly in practice. You can't fire off an LDA until after the 8 cycles have elapsed.
Basically, an instruction enters the processor via a bus called the "data bus." It gets shoved into the instruction register when the instruction fetch signal is fired and sits there. The TCU (timing control unit) is reset to 0 every time an instruction fetch is issued and increments at the beginning of every cycle until the number of cycles required to execute the instruction has elapsed. This whole thing serves as a means of pacing instructions so that no conflict occurs (i.e., an LDA and STA reading/writing to the same offset could lead to very, VERY unpredictable data if it could occur, which it can't because of the TCU). This also makes sure that coprocessor routines don't end up entering a deadlock or fighting for resources with processor routines, such as the ALU putting the finishing touches on a multiplication.
As a result, the LDA will be fired off before the ALU finishes multiplying (as the LDA is issued before the 8 cycles are over) and it'll read what's essentially crap from the register, THEN the value will be written to it, but the LDA will have already been shifted out of the instruction register and the NOPs will be queued up all nice and snug.
You ARE thinking along the right lines, though: in the meantime you could, say, LDA from some other offset (for example, to retrieve the ID of the stat to check after issuing the CharID x CharStructureSize multiplication that gives the offset into the character's stats). But you can't do zilch with the ALU's result until it's done.
Edit: Here's an example of what I mean. I'll forego the flag manipulations, just pretend the accumulator's 8 bits and the index registers are 16 bits.
Code: | LDA $1200 ; Character ID
STA $4202
LDA 0x20 ; Character stats structure is 32 bytes per character.
STA $4203 ; Multiplication starts here.
LDA $15A0 ; ID of the stat we want to check.
ASL ; 2 bytes per stat.
CLC
ADC $4216 ; A = (char ID * 0x20) + (stat ID * 2), the offset of the stat |
In this case, we do other stuff before pulling the value, while the coprocessor is busy with it.
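The arithmetic that routine is after can be written out in a few lines of Python. This is only a sketch of the offset calculation, assuming the 32-byte structure and 2-byte stats given in the comments above; the function and parameter names are made up for illustration.

```python
def stat_offset(char_id, stat_id, struct_size=0x20, stat_size=2):
    """Offset of one stat inside a table of fixed-size character records."""
    return char_id * struct_size + stat_id * stat_size


offset = stat_offset(char_id=3, stat_id=5)
print(hex(offset))
# An 8-bit accumulator would only hold the low byte of this value.
print(hex(offset & 0xFF))
```

Note that with an 8-bit accumulator the sum wraps once the character ID reaches 8 (8 x 0x20 = 0x100), so a routine handling more characters than that would need the high byte of the product as well.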