There are many ROMs available that test an emulator for inaccuracies.
- NEStress partially tests PPU, CPU, and controller operation (old; some tests seem to always fail)
- Blargg's test ROMs partially test APU, misc PPU behavior, sprite 0 hit, and MMC3 operation. Refer to PPU_frame_timing for new information that the PPU ROMs test.
- nestest fairly thoroughly tests CPU operation. This is the best test to start with when getting a CPU emulator working for the first time. Start execution at $C000 and compare execution with a known-correct log.
- instr_test tests official and unofficial CPU instructions and lists which ones failed. It will work even if the emulator has no PPU and only supports NROM, because it writes a copy of its output to $6000 (see readme). This tests instructions more thoroughly, but can't help you figure out what's wrong beyond which instruction(s) are failing, so it's better for testing mature CPU emulators.
- instr_misc tests some miscellaneous aspects of instructions, including behavior when 16-bit address wraps around, and dummy reads.
- instr_timing tests timing of all instructions, including unofficial ones, page-crossing, etc.
- cpu_interrupts_v2 tests the behavior and timing of CPU in the presence of interrupts, both IRQ and NMI.
- cpu_reset tests CPU registers just after power and changes during reset, and that RAM isn't changed during reset.
- Sprite 0 Hit test ROMs
- Misc PPU Tests
- ppu_vbl_nmi tests the behavior and timing of the NTSC PPU's VBL flag, NMI enable, and NMI interrupt. Timing is tested to an accuracy of one PPU clock.
- PPU sprite overflow flag timing tests ($2002 bit 5), covering general operation, timing, and obscure pathological behavior (discussion).
- tvpassfail: NTSC color and NTSC/PAL pixel aspect ratio test ROM
- apu_test tests many aspects of the APU that are visible to the CPU. Really obscure things are not tested here.
- apu_mixer verifies proper operation of the APU's sound channel mixer, including relative volumes of channels and non-linear mixing. Recordings made on an NES are available for comparison, though the tests are designed so that you don't really need them.
- apu_reset tests initial APU state at power, and the effect of reset.
- volume_tests plays tones on all the APU's channels to show their relative volumes at various settings of $4011. Package includes a recording from an NES's audio output for comparison.
- mmc3_test tests the MMC3 scanline counter and IRQ generation, but currently not much else.
- BNTest tests how many PRG banks are reachable in BxROM and AxROM
- Balloon Fight relies on reading the nametables through $2007 to twinkle the stars in the background. (The code is at $D603.)
- Among the most popular NROM games, which are generally the first targets against which an emulator author tests his or her work, Super Mario Bros. is probably the hardest to emulate. It relies on JMP indirect, correct palette mirroring, sprite 0 detection, the 1-byte delay when reading from CHR ROM through $2007, and proper behavior of the nametable selection bits of $2000 and $2006. In addition, there are several bad dumps floating around, some of which were ripped from pirate multicarts whose cheat menus leave several key parameters in RAM.
- Adventures of Lolo 2 and Spelunker rely on a 1-cycle NMI delay when $2002 bit 7 gets set inside vblank (if $2002 has not been read yet)
- Battletoads needs precise CPU and PPU timing
- Bee 52 needs accurate DMC timing and relies on $2002 bit 5 as well
- Cobra Triangle and Iron Sword rely on the dummy read of the sta $4000,X instruction to acknowledge pending APU IRQs.
- Fire Hawk needs accurate DMC timing and does mid-frame palette changes
- Micro Machines requires correct values when reading PPU $2004 during rendering, and also relies on proper background color selection when rendering is disabled and the VRAM address points to the palette
- Ms. Pac-Man (Tengen) relies on being able to read $2002 bit 7 as true before NMI occurs
- Super Mario Bros. 3 relies on an interaction between the sprite priority bit and the OAM index to put power-ups behind blocks
- Slalom does a JSR while the stack pointer is 0, so that half of the return address ends up at $0100 and the other half at $01FF.
- Galaxian requires proper handling of bit 4 of the P register for /IRQ.
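Two of the CPU behaviors in the list above, the JMP indirect page wrap that Super Mario Bros. depends on and the stack pointer wrap that Slalom triggers, are easy to get wrong. The following is a minimal sketch of both, not taken from any particular emulator; the flat `mem` array and all function names are hypothetical.

```python
# Flat 64 KiB address space, for illustration only (no mirroring or mappers).
mem = bytearray(0x10000)

def read(addr):
    return mem[addr & 0xFFFF]

def jmp_indirect_target(ptr):
    """Return the target of JMP (ptr), including the 6502 page-wrap quirk."""
    lo = read(ptr)
    # The pointer's high byte is not incremented: ($xxFF) reads its
    # second byte from $xx00, not from the start of the next page.
    hi = read((ptr & 0xFF00) | ((ptr + 1) & 0x00FF))
    return (hi << 8) | lo

def push(sp, value):
    """Store one byte in the stack page and return the new 8-bit SP."""
    mem[0x0100 | sp] = value & 0xFF
    return (sp - 1) & 0xFF  # SP wraps from $00 to $FF

def jsr_push_return(sp, return_addr):
    """Push a JSR return address, high byte first, and return the new SP."""
    sp = push(sp, return_addr >> 8)
    sp = push(sp, return_addr & 0xFF)
    return sp
```

With the stack pointer at 0, `jsr_push_return` stores the high byte at $0100 and the low byte at $01FF, which is the split described for Slalom.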
Game Bugs lists games that have glitches on NES hardware, so you won't go "fixing" them while breaking your emulator.
If a scroll split doesn't work, and a garbage sprite shows up around the intended split point, then the game is probably trying to use a sprite 0 hit, but either the wrong tile data is loaded or the background is scrolled to a position that doesn't overlap correctly. This could be a problem with nametable mirroring, with CHR bankswitching in mappers that support it, or with the CPU and PPU timing of whatever happened above the split. Battletoads, for one, uses 1-screen mirroring and requires exact timing to get the background scroll position dead-on.
It's best if your emulator can automatically run a suite of tests at the press of a button. This allows you to re-run them every time you make a change, without any effort. Automation can be difficult, because the emulator must be able to determine success/failure without your help.
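For test ROMs such as instr_test that write a copy of their output to $6000, the emulator can decide pass or fail by reading memory instead of looking at the screen. The sketch below assumes a particular layout: a status byte at $6000 ($80 while running, $81 when the test wants a reset, otherwise a result code with 0 meaning success), the signature bytes $DE $B0 $61 at $6001-$6003, and zero-terminated text from $6004. Treat that layout as an assumption and check it against the readme shipped with each ROM. The function name and return values are hypothetical.

```python
SIGNATURE = bytes([0xDE, 0xB0, 0x61])  # assumed marker at $6001-$6003

def read_test_result(prg_ram):
    """Interpret a copy of $6000-$7FFF. Returns (state, code, text)."""
    data = bytes(prg_ram)
    if data[1:4] != SIGNATURE:
        # The ROM has not set up its output area yet, or does not use it.
        return ("no_signature", None, "")
    end = data.find(b"\x00", 4)
    if end == -1:
        end = len(data)
    text = data[4:end].decode("ascii", errors="replace")
    status = data[0]
    if status == 0x80:
        return ("running", None, text)
    if status == 0x81:
        return ("needs_reset", None, text)
    if status < 0x80:
        return ("passed" if status == 0 else "failed", status, text)
    return ("unknown", status, text)
```

A test runner could call this once per frame and stop as soon as the state is "passed" or "failed", with a frame limit as a timeout.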
Automating test ROMs that don't require any button presses is simplest. An emulator merely needs to run the ROM for a sufficient time (with an uncapped frame rate), then take a screenshot. If the screenshot differs from what it was the last time the test was run, it should make a note of this in the log. Later you can re-run any tests whose screenshots changed and examine the result yourself. Detecting a changed screenshot doesn't even require any image files; the emulator can simply compute a checksum of the picture and compare it with the previous one.
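The checksum comparison could look roughly like the sketch below. It assumes the emulator can hand over the final frame as a bytes-like pixel buffer; CRC-32, the JSON file, and every name here are illustrative choices rather than an established convention.

```python
import json
import zlib
from pathlib import Path

def load_checksums(path):
    """Load the checksums saved by the previous run, or an empty table."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}

def save_checksums(path, known):
    Path(path).write_text(json.dumps(known, indent=2, sort_keys=True))

def compare_with_previous(known, rom_name, pixels):
    """Return 'new', 'same', or 'changed', and remember this run's checksum."""
    current = zlib.crc32(bytes(pixels))
    previous = known.get(rom_name)
    # Always store the latest value: the comparison is against the last run.
    known[rom_name] = current
    if previous is None:
        return "new"
    return "same" if previous == current else "changed"
```

Any ROM reported as "changed" goes into the log for manual inspection. A checksum only detects that something changed; it cannot say whether the new picture is the correct one.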
Automating test ROMs that do require button presses is more complex. In many cases, the emulator can simply feed a series of button presses, with a fixed delay between each, perhaps one second. If the emulator has a movie feature which merely records controller input, you can simply record a movie of your pressing the appropriate buttons, then have the emulator play that back normally.
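A fixed-delay press sequence can be expanded into one input value per frame, which is also roughly what an input-only movie contains. This is a minimal sketch; the bit assigned to each button, the default of 60 frames for about one second, and the function name are all assumptions made for illustration.

```python
# Arbitrary bit assignment for this sketch; use whatever your emulator uses.
BUTTONS = {
    "A": 0x01, "B": 0x02, "Select": 0x04, "Start": 0x08,
    "Up": 0x10, "Down": 0x20, "Left": 0x40, "Right": 0x80,
}

def build_input_schedule(presses, delay_frames=60, hold_frames=5):
    """Return a list with one button bitmask per frame."""
    frames = []
    for name in presses:
        frames.extend([0] * delay_frames)             # wait, nothing pressed
        frames.extend([BUTTONS[name]] * hold_frames)  # hold the button
    frames.append(0)  # release the last button
    return frames
```

For example, `build_input_schedule(["Start", "A"])` waits about a second, taps Start, waits again, then taps A. The emulator would feed `frames[i]` to the controller port on frame `i` and take its screenshot some time after the schedule ends.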
Automating game testing is the most complex. As with test ROMs that require button presses, playing back a movie might work. One difference is that the result won't be a simple screenshot at the end; it will potentially need to monitor images/sounds throughout the movie. This could still be a simple checksum. If a game's behavior changes, you can go back and compare it frame-by-frame with what played back on the previous version of your emulator.
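One way to monitor the whole movie is to record a checksum for every frame and report the first frame where two runs disagree. The sketch below assumes each frame is available as a bytes-like buffer; the same idea could be applied to chunks of audio samples. The names are hypothetical.

```python
import zlib

def record_frame_checksums(frames):
    """Return one CRC-32 per frame for a whole movie playback."""
    return [zlib.crc32(bytes(frame)) for frame in frames]

def first_divergence(previous, current):
    """Return the index of the first differing frame, or None if identical."""
    for index, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return index
    if len(previous) != len(current):
        # One run ended early; the divergence is where the shorter one stops.
        return min(len(previous), len(current))
    return None
```

The returned index tells you where to start the frame-by-frame comparison against the previous version of the emulator.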