views:

72210

answers:

11

How do emulators work? When I see NES / SNES or C64 emulators, it astounds me.

http://www.tommowalker.co.uk/snemzelda.png

Do you have to emulate the processor of those machines by interpreting its particular assembly instructions? What else goes into it? How are they typically designed?

Can you give any advice for someone interested in writing an emulator (particularly a game system)?

+502  A: 

Emulation is a multi-faceted area. Here are the basic ideas and functional components. I'm going to break it into pieces and then fill in the details via edits. Many of the things I'm going to describe will require knowledge of the inner workings of processors -- assembly knowledge is necessary. If I'm a bit too vague on certain things, please ask questions so I can continue to improve this answer.

Basic idea:

Emulation works by handling the behavior of the processor and the individual components. You build each individual piece of the system and then connect the pieces much like wires do in hardware.

Processor emulation:

There are three ways of handling processor emulation:

  • Interpretation
  • Dynamic recompilation
  • Static recompilation

With all of these paths, you have the same overall goal: execute a piece of code to modify processor state and interact with 'hardware'. Processor state is a conglomeration of the processor registers, interrupt handlers, etc for a given processor target. For the 6502, you'd have a number of 8-bit integers representing registers: A, X, Y, P, and S; you'd also have a 16-bit PC register.
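
As a rough illustration of what "processor state" means for the 6502 registers listed above, here is a minimal Python sketch. The class name, field names and the wrap8/wrap16 helpers are made up for this example, and the zero defaults are placeholders rather than real power-on values.

```python
from dataclasses import dataclass


def wrap8(value: int) -> int:
    """Keep a value in the 8-bit range, as an 8-bit register would."""
    return value & 0xFF


def wrap16(value: int) -> int:
    """Keep a value in the 16-bit range (used for the program counter)."""
    return value & 0xFFFF


@dataclass
class Cpu6502State:
    """Register file of a 6502 (field names are illustrative)."""
    a: int = 0   # accumulator, 8-bit
    x: int = 0   # index register X, 8-bit
    y: int = 0   # index register Y, 8-bit
    p: int = 0   # processor status flags, 8-bit
    s: int = 0   # stack pointer, 8-bit
    pc: int = 0  # program counter, 16-bit


state = Cpu6502State()
state.x = wrap8(0xFF + 1)      # 8-bit registers wrap around to 0
state.pc = wrap16(0xFFFF + 1)  # so does the 16-bit program counter
```

Every instruction handler then becomes a small function that reads and writes one of these objects.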

With interpretation, you start at the IP (instruction pointer -- also called PC, program counter) and read the instruction from memory. Your code parses this instruction and uses this information to alter processor state as specified by your processor. The core problem with interpretation is that it's very slow; each time you handle a given instruction, you have to decode it and perform the requisite operation.
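
The fetch/decode/execute loop described above can be sketched in a few lines of Python. This is a toy under loud assumptions: it handles only five 6502 opcodes, ignores the status flags entirely, and treats BRK as "stop" instead of raising an interrupt as the real chip does. The function name is hypothetical.

```python
def interpret(memory: bytearray, pc: int = 0) -> dict:
    """Run guest code starting at `pc` until a BRK opcode is fetched."""
    regs = {"a": 0, "x": 0, "pc": pc}
    while True:
        opcode = memory[regs["pc"]]                  # fetch
        regs["pc"] = (regs["pc"] + 1) & 0xFFFF
        if opcode == 0xA9:                           # LDA #imm
            regs["a"] = memory[regs["pc"]]
            regs["pc"] = (regs["pc"] + 1) & 0xFFFF
        elif opcode == 0xAA:                         # TAX
            regs["x"] = regs["a"]
        elif opcode == 0xE8:                         # INX
            regs["x"] = (regs["x"] + 1) & 0xFF
        elif opcode == 0x8D:                         # STA abs (little-endian)
            lo = memory[regs["pc"]]
            hi = memory[(regs["pc"] + 1) & 0xFFFF]
            memory[lo | (hi << 8)] = regs["a"]
            regs["pc"] = (regs["pc"] + 2) & 0xFFFF
        elif opcode == 0x00:                         # BRK: treated as "halt"
            return regs
        else:
            raise NotImplementedError(f"opcode {opcode:#04x}")


# LDA #$C0 ; TAX ; INX ; STA $0200 ; BRK
memory = bytearray(0x10000)
memory[0:8] = bytes([0xA9, 0xC0, 0xAA, 0xE8, 0x8D, 0x00, 0x02, 0x00])
regs = interpret(memory)
```

Note how every pass through the loop pays for the decode again, which is exactly the cost the paragraph above describes.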

With dynamic recompilation, you iterate over the code much like interpretation, but instead of just executing opcodes, you build up a list of operations. Once you reach a branch instruction, you compile this list of operations to machine code for your host platform, then you cache this compiled code and execute it. Then when you hit a given instruction group again, you only have to execute the code from the cache. (BTW, most people don't actually make a list of instructions but compile them to machine code on the fly -- this makes it more difficult to optimize, but that's out of the scope of this answer, unless enough people are interested)
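
To make the translate/cache/execute cycle concrete, here is a small Python sketch. The big assumption: instead of emitting host machine code, it emits Python source for each block and compiles it with exec, which stands in for a real code generator. It covers only a handful of 6502 opcodes, ignores flags, and all function names are invented for the example.

```python
def translate_block(memory: bytearray, start: int):
    """Translate guest code from `start` up to the next block-ending
    instruction into a Python function (a stand-in for host machine code)."""
    lines = ["def block(regs, memory):"]
    pc = start
    while True:
        opcode = memory[pc]
        if opcode == 0xA9:        # LDA #imm: the operand is baked in
            lines.append(f"    regs['a'] = {memory[pc + 1]}")
            pc += 2
        elif opcode == 0xAA:      # TAX
            lines.append("    regs['x'] = regs['a']")
            pc += 1
        elif opcode == 0xE8:      # INX
            lines.append("    regs['x'] = (regs['x'] + 1) & 0xFF")
            pc += 1
        elif opcode == 0x4C:      # JMP abs: ends the block, returns next pc
            target = memory[pc + 1] | (memory[pc + 2] << 8)
            lines.append(f"    return {target}")
            break
        elif opcode == 0x00:      # BRK: treated as "halt" in this sketch
            lines.append("    return None")
            break
        else:
            raise NotImplementedError(f"opcode {opcode:#04x}")
    namespace = {}
    exec("\n".join(lines), namespace)   # "compile" the block
    return namespace["block"]


def run(memory: bytearray, cache: dict, pc: int = 0) -> dict:
    """Execute guest code, translating each block only on first use."""
    regs = {"a": 0, "x": 0}
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = cache[pc] = translate_block(memory, pc)
        pc = block(regs, memory)
    return regs


# LDA #5 ; JMP $0010      and at $0010: TAX ; INX ; BRK
memory = bytearray(0x10000)
memory[0:5] = bytes([0xA9, 0x05, 0x4C, 0x10, 0x00])
memory[0x10:0x13] = bytes([0xAA, 0xE8, 0x00])
cache = {}
first = run(memory, cache)    # translates two blocks
second = run(memory, cache)   # reuses both blocks from the cache
```

Because operands are baked into the translated block, a real dynarec also has to throw away cached blocks when the guest overwrites its own code.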

With static recompilation, you do the same as in dynamic recompilation, but you follow branches. You end up building a chunk of code that represents all of the code in the program, which can then be executed with no further interference. This would be a great mechanism if it weren't for the following problems:

  • Code that isn't in the program to begin with (e.g. compressed, encrypted, generated/modified at runtime, etc) won't be recompiled, so it won't run
  • It's been proven that finding all the code in a given binary is equivalent to the Halting problem

These combine to make static recompilation completely infeasible in 99% of cases. For more information, Michael Steil has done some great research into static recompilation -- the best I've seen.

The other side to processor emulation is the way in which you interact with hardware. This really has two sides:

  • Processor timing
  • Interrupt handling

Processor timing:

Certain platforms -- especially older consoles like the NES, SNES, etc -- require your emulator to have strict timing to be completely compatible. With the NES, you have the PPU (picture processing unit), which requires that the CPU put pixels into its memory at precise moments. If you use interpretation, you can easily count cycles and emulate proper timing; with dynamic/static recompilation, things are a /lot/ more complex.
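
Here is a minimal sketch of cycle counting with an interpreter, assuming an NTSC NES where the PPU advances 3 dots per CPU cycle and a scanline is 341 dots long (262 scanlines per frame). The "CPU" is reduced to a list of opcodes with their cycle costs; nothing is actually executed, and the Ppu class and run_lockstep are names invented for this example.

```python
# Cycle cost of a few 6502 opcodes: LDA #imm, STA abs, NOP, JMP abs.
CYCLES = {0xA9: 2, 0x8D: 4, 0xEA: 2, 0x4C: 3}

PPU_DOTS_PER_CPU_CYCLE = 3   # NTSC NES ratio (assumed target)
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262


class Ppu:
    """Tracks only the beam position; a real PPU would also draw pixels."""

    def __init__(self):
        self.dot = 0
        self.scanline = 0

    def step(self, dots: int) -> None:
        for _ in range(dots):
            self.dot += 1
            if self.dot == DOTS_PER_SCANLINE:
                self.dot = 0
                self.scanline = (self.scanline + 1) % SCANLINES_PER_FRAME


def run_lockstep(opcodes, ppu: Ppu) -> int:
    """After each CPU instruction, let the PPU catch up by the same time."""
    total_cycles = 0
    for opcode in opcodes:
        cycles = CYCLES[opcode]          # a real core would execute it here
        total_cycles += cycles
        ppu.step(cycles * PPU_DOTS_PER_CPU_CYCLE)
    return total_cycles


ppu = Ppu()
elapsed = run_lockstep([0xEA] * 57, ppu)   # 57 NOPs = 114 CPU cycles
```

Stepping the PPU after every instruction is the simplest scheme; more accurate emulators interleave at a finer granularity than whole instructions.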

Interrupt handling:

Interrupts are the primary mechanism by which the CPU communicates with hardware. Generally, your hardware components will tell the CPU which interrupts they care about. This is pretty straightforward -- when your code throws a given interrupt, you look at the interrupt handler table and call the proper callback.
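
The handler-table idea can be sketched like this in Python. The class, the interrupt number and the handler are all hypothetical; a real system would define its own interrupt numbers, priorities and masking rules.

```python
from collections import deque


class InterruptController:
    """A table of callbacks keyed by interrupt number, plus a pending queue."""

    def __init__(self):
        self.handlers = {}
        self.pending = deque()

    def register(self, irq: int, callback) -> None:
        """A hardware component declares interest in an interrupt."""
        self.handlers[irq] = callback

    def raise_irq(self, irq: int) -> None:
        """Emulated hardware (or guest code) signals an interrupt."""
        self.pending.append(irq)

    def dispatch(self, cpu_state: dict) -> None:
        """Called between instructions: run the callback for each pending
        interrupt; interrupts nobody registered for are dropped."""
        while self.pending:
            irq = self.pending.popleft()
            handler = self.handlers.get(irq)
            if handler is not None:
                handler(cpu_state)


VBLANK = 1  # hypothetical interrupt number


def on_vblank(cpu_state: dict) -> None:
    # Hypothetical handler: redirect the CPU to a made-up service routine.
    cpu_state["pc"] = 0x8000


controller = InterruptController()
controller.register(VBLANK, on_vblank)
cpu_state = {"pc": 0x0000}
controller.raise_irq(VBLANK)
controller.dispatch(cpu_state)
```

Calling dispatch once per emulated instruction keeps interrupt delivery in step with the cycle counting described earlier.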

Hardware emulation:

There are two sides to emulating a given hardware device:

  • Emulating the functionality of the device
  • Emulating the actual device interfaces

Take the case of a hard-drive. The functionality is emulated by creating the backing storage, read/write/format routines, etc. This part is generally very straightforward.

The actual interface of the device is a bit more complex. This is generally some combination of memory mapped registers (e.g. parts of memory that the device watches for changes to do signaling) and interrupts. For a hard-drive, you may have a memory mapped area where you place read commands, writes, etc, then read this data back.
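
A memory-mapped interface like the hard-drive example might be sketched as follows. Everything specific here is invented for illustration: the register layout (sector number at offset 0, command at offset 1, data port at offset 2), the base address 0xD000 and the 4-byte "sector" size.

```python
class DiskController:
    """Hypothetical drive: backing storage plus three mapped registers."""
    SECTOR_SIZE = 4  # tiny, to keep the example readable

    def __init__(self, image: bytes):
        self.image = image   # the backing storage
        self.sector = 0
        self.buffer = b""
        self.pos = 0

    def write(self, offset: int, value: int) -> None:
        if offset == 0:                     # sector select register
            self.sector = value
        elif offset == 1 and value == 1:    # command register: 1 = read
            start = self.sector * self.SECTOR_SIZE
            self.buffer = self.image[start:start + self.SECTOR_SIZE]
            self.pos = 0

    def read(self, offset: int) -> int:
        if offset == 2 and self.pos < len(self.buffer):   # data port
            value = self.buffer[self.pos]
            self.pos += 1
            return value
        return 0


class Bus:
    """Routes each access to RAM or to whichever device owns the address."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.devices = []   # list of (base, size, device)

    def map_device(self, base: int, size: int, device) -> None:
        self.devices.append((base, size, device))

    def _lookup(self, addr: int):
        for base, size, device in self.devices:
            if base <= addr < base + size:
                return device, addr - base
        return None, addr

    def read(self, addr: int) -> int:
        device, offset = self._lookup(addr)
        return self.ram[addr] if device is None else device.read(offset)

    def write(self, addr: int, value: int) -> None:
        device, offset = self._lookup(addr)
        if device is None:
            self.ram[addr] = value & 0xFF
        else:
            device.write(offset, value & 0xFF)


bus = Bus()
bus.map_device(0xD000, 3, DiskController(bytes(range(16))))
bus.write(0xD000, 2)                           # select sector 2
bus.write(0xD001, 1)                           # issue the read command
data = [bus.read(0xD002) for _ in range(4)]    # read the sector back
```

The CPU core only ever talks to the bus, so it never needs to know whether an address is plain RAM or a device register.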

I'd go into more detail, but there are a million ways you can go with it. If you have any specific questions here, feel free to ask and I'll add the info.

Resources:

I think I've given a pretty good intro here, but there are a ton of additional areas. I'm more than happy to help with any questions; I've been very vague in most of this simply due to the immense complexity.

General emulation resources:

  • Zophar -- This is where I got my start with emulation, first downloading emulators and eventually plundering their immense archives of documentation. This is the absolute best resource you can possibly have.
  • NGEmu -- Not many direct resources, but their forums are unbeatable.

Emulator projects to reference:

  • IronBabel -- This is an emulation platform for .NET, written in Nemerle, that recompiles code to C# on the fly. Disclaimer: This is my project, so pardon the shameless plug.
  • BSnes -- Awesome SNES emulator. You can read about it here.
  • MAME -- The arcade emulator. Great reference.
  • 6502asm.com -- This is a JavaScript 6502 emulator with a cool little forum.
  • dynarec'd 6502asm -- This is a little hack I did over a day or two. I took the existing emulator from 6502asm.com and changed it to dynamically recompile the code to JavaScript for massive speed increases.

Processor recompilation references:

  • The research into static recompilation done by Michael Steil (referenced above) culminated in this paper and you can find source and such here.

Addendum:

It's been well over a year since this answer was submitted and with all the attention it's been getting, I figured it's time to update some things.

Perhaps the most exciting thing in emulation right now is libcpu, started by the aforementioned Michael Steil. It's a library intended to support a large number of CPU cores, which use LLVM for recompilation (static and dynamic!). It's got huge potential, and I think it'll do great things for emulation.

emu-docs has also been brought to my attention, which houses a great repository of system documentation, which is very useful for emulation purposes. I haven't spent much time there, but it looks like they have a lot of great resources.

I'm glad this post has been helpful, and I'm hoping I can get off my arse and finish up my book on the subject by the end of the year/early next year.

Cody Brocious
This is already gearing up to be an epic answer. If you can point me to any resources as well at the end of it, that would be appreciated. I'm looking at possibly the SNES or NES system to emulate and making it my semester project.
Simucal
Certainly. I'm going to assemble a nice list of resources. If you guys have any specific requests, I'll do my best to fill them.
Cody Brocious
In terms of requests, resources directly related to both emulator theory/design and more specific resources related to snes/nes systems. Because of the speed of modern systems would you say that writing a useable snes emulator using interpretation is possible?
Simucal
+1; can't wait for the next installment.
Adam Jaskiewicz
@Simucal, I don't see why not. The SNES is fairly simple, really.
Cody Brocious
I wish I could give this answer a +2 :) I keep checking back for more haha
Jason Coco
Are you an emulator programmer?
thenonhacker
I think this rates as the best answer I've ever read to the question "How do emulators work". Great stuff! What would be even more brilliant is an example of a simple emulator with explanatory notes :)
Leonard H Martin
@thenonhacker, The IronBabel project referenced in my resources section is mine. (The shameless plug is marked ;) )
Cody Brocious
@Cody Brocious, can't thank you enough. Best answer I have received on SO hands down.
Simucal
@Simucal, No problem. This is about all I can think up at the moment, but if you have any questions at all, I'll roll the answers in. Btw, Zophar has all the references you'll need for SNES/NES.
Cody Brocious
This is a model answer - great work Cody! =)
Erik Forbes
"It's been proven that finding all the code in a given binary is equivalent to the Halting problem" -- Reference please? Or should it be "It's been proven that finding all the code in *any* given binary is equivalent to the Halting problem"? Also can't access Steil's paper :-(
squelart
+1 'cause that made my head hurt; it's so far above my head I couldn't even see the air that it offset with the butterfly effect.
Unkwntech
Ioxp
Close to a gold badge from this great answer!
Simucal
Ok, it's the end of next year -- where's that book? ;) :P
RCIX
Something else that is interesting to note is that some games have hardware IN the cartridge to provide additional CPU power or functionality. The only example that comes to mind right now is Dungeon Master on the SNES, which had a custom sound chip in the cartridge itself to allow some fancier capabilities.
SLC
@SLC, This is very common with cartridge-based consoles, and it's a large part of their longevity as well. If you look at the NES, most games have 'mapper' hardware in them, which has to be emulated individually.
Cody Brocious
How are the NES/SNES or any consoles ROM files created? Are they generated by the manufacturers of the games?
Sev
Well, that depends. Generally, game developers will only create ROMs for their own purposes, e.g. distributing it with an emulator on the Wii's Virtual Console service. Outside of that, a ROM dumper is used, which is a piece of hardware to read all the data off the cartridge.
Cody Brocious
Interesting, thanks Cody, good to know.
Sev
+ 1 for THE SAKE OF + 1
Shaharyar
+1 for the most awesome answer I've read so far.
Stephen
+7  A: 

Yes, you have to interpret the whole binary machine code mess "by hand". Not only that, most of the time you also have to simulate some exotic hardware that doesn't have an equivalent on the target machine.

The simple approach is to interpret the instructions one-by-one. That works well, but it's slow. A faster approach is recompilation - translating the source machine codes to target machine codes. This is more complicated, as most instructions will not map one-on-one. Instead you will have to make elaborate workarounds that involve additional code. But in the end it's much faster. Most modern emulators do this.

Vilx-
The worst thing by far is missing documentation. It's when you find out that the modified Z80 core in the GameBoy Color has undocumented flag operations that the game you're testing uses that you really start to lose faith.
Callum Rogers
+5  A: 

When you develop an emulator you are interpreting the processor assembly that the system is working on (Z80, 8080, PS CPU, etc.).

You also need to emulate all peripherals that the system has (video output, controller).

You should start by writing emulators for simple systems like the good old Game Boy (which uses a Z80-like processor, if I'm not mistaken) or the C64.

Baget
C64 a "simple" system? While the 6510 is relatively simple (once you've covered the unlisted opcodes), the sound (SID) and video (VIC) chips are anything *but* simple. To achieve any decent level of compatibility, you'd need to emulate them - hardware bugs and all.
moobaa
+10  A: 

Emulation may seem daunting, but it is actually quite a bit easier than simulating.

Any processor typically has a well-written specification that describes states, interactions, etc.

If you did not care about performance at all, then you could easily emulate most older processors using very elegant object oriented programs. For example, an X86 processor would need something to maintain the state of registers (easy), something to maintain the state of memory (easy), and something that would take each incoming command and apply it to the current state of the machine. If you really wanted accuracy, you would also emulate memory translations, caching, etc., but that is doable.

In fact, many microchip and CPU manufacturers test programs against an emulator of the chip and then against the chip itself, which helps them find out if there are issues in the specifications of the chip, or in the actual implementation of the chip in hardware. For example, it is possible to write a chip specification that would result in deadlocks, and when a deadlock occurs in the hardware it's important to see if it can be reproduced in the specification, since that indicates a greater problem than something in the chip implementation.

Of course, emulators for video games usually care about performance so they don't use naive implementations, and they also include code that interfaces with the host system's OS, for example to use drawing and sound.

Considering the very slow performance of old video games (NES/SNES, etc.), emulation is quite easy on modern systems. In fact, it's even more amazing that you could just download a set of every SNES game ever or any Atari 2600 game ever, considering that when these systems were popular having free access to every cartridge would have been a dream come true.

Uri
What are the differences between emulation and simulation?
Wei Hu
@Wei: Generally speaking, an emulator is supposed to behave "externally" like the system it emulates but there is nothing to say it has to be implemented in a similar way. A simulator is implemented in a way that mimics the simulated system, and as a result behaves like it.
Uri
When you see "Simulator", think "it's similar", while an emulator "emulates".
mP
+37  A: 

A guy named Victor Moya del Barrio wrote his thesis on this topic. A lot of good information on 152 pages. You can download the PDF here.

If you don't want to register with scribd, you can google for the PDF title, "Study of the techniques for emulation programming". There are a couple of different sources for the PDF.

mdm
Very good resource, thanks!
Simucal
+1 very good resource
claws
A: 

Advice on emulating a real system or your own thing? I can say that emulators work by emulating the ENTIRE hardware, though maybe not down to the circuit level: you don't have to move bits around the way the hardware would, because moving the byte is the end result, so copying the byte is fine. Emulators are very hard to create since there are many hacks (as in unusual effects), timing issues, etc that you need to simulate. If one (input) piece is wrong, the entire system can go down or, at best, have a bug/glitch.

acidzombie24
A: 

The Shared Source Device Emulator contains buildable source code to a PocketPC/Smartphone emulator (Requires Visual Studio, runs on Windows). I worked on V1 and V2 of the binary release.

It tackles many emulation issues:

  • efficient address translation from guest virtual to guest physical to host virtual
  • JIT compilation of guest code
  • simulation of peripheral devices such as network adapters, touchscreen and audio
  • UI integration, for host keyboard and mouse
  • save/restore of state, for simulation of resume from low-power mode

Barry Bond
+1  A: 

Also check out Darek Mihocka's Emulators.com for great advice on instruction-level optimization for JITs, and many other goodies on building efficient emulators.

Barry Bond
+2  A: 

Emulators are very hard to create since there are many hacks (as in unusual effects), timing issues, etc that you need to simulate.

For an example of this, see http://queue.acm.org/detail.cfm?id=1755886.

That will also show you why you ‘need’ a multi-GHz CPU for emulating a 1MHz one.

Someone
+5  A: 

I know that this question is a bit old, but I would like to add something to the discussion. Most of the answers here center around emulators interpreting the machine instructions of the systems they emulate.

However, there is a very well-known exception to this called "UltraHLE" (Wikipedia article). UltraHLE, one of the most famous emulators ever created, emulated commercial Nintendo 64 games (with decent performance on home computers) at a time when it was widely considered impossible to do so. As a matter of fact, Nintendo was still producing new titles for the Nintendo 64 when UltraHLE was created!

For the first time, I saw articles about emulators in print magazines where before, I had only seen them discussed on the web.

The concept of UltraHLE was to make possible the impossible by emulating C library calls instead of machine level calls.
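
To give a feel for the idea (this is not UltraHLE's actual code), here is a Python sketch of high-level emulation: when the guest's program counter reaches the address of a known library routine, a native implementation runs instead of the guest's machine code. The hook address 0x8000, the register names and the memcpy-style routine are all assumptions made for this example.

```python
def hle_memcpy(regs: dict, memory: bytearray) -> None:
    """Native replacement for a guest memcpy-like routine (hypothetical
    calling convention: arguments and return address live in `regs`)."""
    dst, src, n = regs["arg0"], regs["arg1"], regs["arg2"]
    memory[dst:dst + n] = memory[src:src + n]
    regs["pc"] = regs["return_address"]   # "return" to the caller


# Guest addresses of recognised library routines (made-up address).
HLE_HOOKS = {0x8000: hle_memcpy}


def step(regs: dict, memory: bytearray, interpret_one) -> None:
    """Run one step: use a high-level hook if one exists for this address,
    otherwise fall back to ordinary instruction-level emulation."""
    hook = HLE_HOOKS.get(regs["pc"])
    if hook is not None:
        hook(regs, memory)
    else:
        interpret_one(regs, memory)


def unreachable(regs: dict, memory: bytearray) -> None:
    raise AssertionError("the hook should have handled this address")


memory = bytearray(0x10000)
memory[0x100:0x104] = b"abcd"
regs = {"pc": 0x8000, "arg0": 0x200, "arg1": 0x100, "arg2": 4,
        "return_address": 0x1234}
step(regs, memory, unreachable)
```

One native call replaces what could be hundreds of interpreted guest instructions, which is where the speed comes from; the price is that every recognised routine must be identified and reimplemented by hand.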

Rice Flour Cookies
+4  A: 

Having created my own emulator of the BBC Microcomputer of the 80s (type VBeeb into Google), I've found there are a number of things to know.

  • You're not emulating the real thing as such, that would be a replica. Instead, you're emulating State. A good example is a calculator, the real thing has buttons, screen, case etc. But to emulate a calculator you only need to emulate whether buttons are up or down, which segments of LCD are on, etc. Basically, a set of numbers representing all the possible combinations of things that can change in a calculator.
  • You only need the interface of the emulator to appear and behave like the real thing. The more convincing this is the closer the emulation is. What goes on behind the scenes can be anything you like. But, for ease of writing an emulator, there is a mental mapping that happens between the real system, i.e. chips, displays, keyboards, circuit boards, and the abstract computer code.
  • To emulate a computer system, it's easiest to break it up into smaller chunks and emulate those chunks individually. Then string the whole lot together for the finished product. Much like a set of black boxes with inputs and outputs, which lends itself beautifully to object oriented programming. You can further subdivide these chunks to make life easier.

Practically speaking, you're generally looking to write for speed and fidelity of emulation. This is because software on the target system will (may) run more slowly than the original hardware on the source system. That may constrain the choice of programming language, compilers, target system etc.
Further to that, you have to circumscribe what you're prepared to emulate; for example, it's not necessary to emulate the voltage state of transistors in a microprocessor, but it's probably necessary to emulate the state of the register set of the microprocessor.
Generally speaking, the finer the level of detail of the emulation, the more fidelity you'll get to the original system.
Finally, information for older systems may be incomplete or non-existent. So getting hold of original equipment is essential, or at least prising apart another good emulator that someone else has written!

Jonathan Swift