NES emulation

That'd probably be insanely hard, if it was possible at all.

Aside from DMA timing (which I imagine would be a tremendous pain in the ass), is there anything else that would be a really tough issue?
 
Originally posted by Tagrineth@Nov 6, 2003 @ 11:02 PM

vbt, here's a thought... try to get multiprocessing running on the 68k (say, run the sound on it) before using the second SH-2, and see how that runs.

I don't follow. By 68k do you mean the Saturn's sound processor (which is similar to the Genesis CPU)? And if so, do you mean to emulate all of the NES sound system on that CPU? :huh
 
Hey, the entire Speccy has been emulated on a 68k :cool:

In fact I think it's a pretty neat idea, but not so neat that it can replace one of the SH-2s... that is, if you can find a good use for dual CPUs in an emulator.
 
Originally posted by slinga@Nov 7, 2003 @ 11:56 AM

I don't follow. By 68k do you mean the Saturn's sound processor (which is similar to the Genesis CPU)? And if so, do you mean to emulate all of the NES sound system on that CPU? :huh

Yes, I mean the Motorola 68EC000 that controls the sound chip in Saturn.

Hey, the entire Speccy has been emulated on a 68k

In fact I think it's a pretty neat idea, but not so neat that it can replace one of the SH-2s... that is, if you can find a good use for dual CPUs in an emulator.

See, 68k is surprisingly powerful and NES's sound... isn't.

Hell, Pagefault (of ZSNES fame) said it might be possible to run the SNES's famous SPC700 entirely on that 68k. And yes, he's done some Saturn work now.

And dual CPUs are insanely useful in emulation. More spread-out processing resources = very good for emulation, since you're emulating several chips at once. Context switches aren't exactly good for performance.

And... I know it can't replace one of the SH-2s, BUT according to pf it's a lot easier to get "MP" running via the 68k and one SH-2 than via both SH-2s, and this could be all vbt needs to get the speed up. It was just a suggestion to do first, since it's a good idea anyway, and if necessary he can still spread work onto the other SH-2... but they aren't called Twin Terrors for no reason.
 
Originally posted by antime@Nov 7, 2003 @ 04:48 PM

Mehh, what OS are you running to suffer from context switching penalties?

It's a given. Nothing to do with OS, it's part of CPU architecture.

I'd like to see you develop a CPU which can flush its pipeline and L1 cache, redirect pointers, and refill its pipeline in one cycle.
 
I'm with Antime on this one. Context-switch penalties only occur when you swap processes or threads in and out. I'd assume that VBT's code would be multithreaded (how else would you make use of the dual CPUs?), but that each thread would run on its own CPU. So 2 threads on 2 CPUs = no switching processes.
 
Originally posted by Tagrineth@Nov 8, 2003 @ 01:34 AM

I'd like to see you develop a CPU which can flush its pipeline and L1 cache, redirect pointers, and refill its pipeline in one cycle.

But why would you do any of that? (And if you really want to know, there are CPUs which do zero-cycle context switches by storing the needed data on-chip. Obviously it only works for a set number of processes.)
 
Originally posted by antime@Nov 7, 2003 @ 06:19 PM

Originally posted by Tagrineth@Nov 8, 2003 @ 01:34 AM

I'd like to see you develop a CPU which can flush its pipeline and L1 cache, redirect pointers, and refill its pipeline in one cycle.

But why would you do any of that? (And if you really want to know, there are CPUs which do zero-cycle context switches by storing the needed data on-chip. Obviously it only works for a set number of processes.)

Hmm, yeah, you're right, sorry... didn't research this quite enough before replying ^^;

Well, does SH-2 have these traits? I doubt it does, it's probably too old. And Slinga, I think the point was that vbt is so far only using one SH-2.
 
Originally posted by Tagrineth@Nov 8, 2003 @ 01:16 AM

Well, does SH-2 have these traits? I doubt it does, it's probably too old. And Slinga, I think the point was that vbt is so far only using one SH-2.

Context switches are rather painful on the SH-2. The cache is way too small to handle data from multiple threads/processes, exception processing is really slow because of the pipeline (though this is less of an issue if you do co-operative multi-tasking, especially if you use delayed branch instructions), and refilling the registers will take a while because of contention.
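To make the co-operative option concrete, here is a minimal sketch in C, assuming the emulator is split into components that each run a fixed slice of cycles and then return on their own. The `Component` type, the function names, and the slice size are hypothetical; they are not taken from vbt's emulator or any Saturn library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated component (CPU, PPU, APU, ...). */
typedef struct {
    const char *name;
    uint32_t cycles_run;   /* total cycles this component has executed */
} Component;

/* Run one component for a slice of cycles, then return to the caller.
   Returning voluntarily is what makes this co-operative: no interrupt
   or exception is needed to take control back. */
static void run_slice(Component *c, uint32_t slice)
{
    c->cycles_run += slice;   /* stand-in for real emulation work */
}

/* Round-robin over all components until each has run `total` cycles.
   The last slice is shortened so nobody overshoots the target. */
static void run_frame(Component *parts, int count, uint32_t total, uint32_t slice)
{
    assert(slice > 0);
    for (uint32_t done = 0; done < total; done += slice) {
        uint32_t step = (total - done < slice) ? (total - done) : slice;
        for (int i = 0; i < count; i++)
            run_slice(&parts[i], step);
    }
}
```

Calling `run_frame(parts, 3, 1000, 128)` on three components leaves each with exactly 1000 cycles run. A smaller slice keeps the components closer in sync at the cost of more hand-offs, which is the trade-off being discussed here.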
 
there are CPUs which do zero-cycle context switches by storing the needed data on-chip

Examples? I'd like to read some white papers on how this is put together (I suppose it's on server-class processors with huge virtual address spaces and they have some virtual address bits act as a context ID...)
 
On the contrary, most examples are specialized embedded controllers for tasks that need high throughput. Here's one such chip which supports eight threads.
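As a rough illustration of the idea (a toy model in C, not a description of any real chip), the difference comes down to whether each context has its own register bank on-chip or shares a single register file that has to be saved and reloaded. All type and function names here are made up for the sketch, and the bank count of eight just mirrors the eight threads mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_CONTEXTS 8    /* fixed in hardware, hence "a set number of processes" */
#define NUM_REGS     16

/* Toy model: every hardware context owns a full register bank on-chip. */
typedef struct {
    uint32_t bank[NUM_CONTEXTS][NUM_REGS];
    unsigned active;      /* index of the bank currently visible to the core */
} BankedCpu;

/* Switching only selects another bank: nothing is saved or restored. */
static void switch_banked(BankedCpu *cpu, unsigned ctx)
{
    assert(ctx < NUM_CONTEXTS);
    cpu->active = ctx;
}

/* Registers of whichever context is currently active. */
static uint32_t *regs(BankedCpu *cpu)
{
    return cpu->bank[cpu->active];
}

/* Conventional model for comparison: one register file, with inactive
   contexts spilled to memory. */
typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t saved[NUM_CONTEXTS][NUM_REGS];
    unsigned active;
} SingleFileCpu;

/* Switching copies every register out and back in again. */
static void switch_saved(SingleFileCpu *cpu, unsigned ctx)
{
    assert(ctx < NUM_CONTEXTS);
    memcpy(cpu->saved[cpu->active], cpu->regs, sizeof cpu->regs);
    memcpy(cpu->regs, cpu->saved[ctx], sizeof cpu->regs);
    cpu->active = ctx;
}
```

In the banked model the cost of a switch does not depend on the number of registers, but the number of contexts is capped at `NUM_CONTEXTS`. The single-file model has no such cap in principle, but pays for two copies on every switch.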
 
I think you overestimate my programming level :blush: . I could do some parts on the 68k in assembly, but I haven't done that for 7 years. Also, I have no idea how to do that with the Saturn 68k (no tools, no assembler/compiler, and I know no way to send code to this 68k).
 
If you need help with assembly, I'm your man (although I haven't programmed much on the 68k, I think I'll be able to adapt easily).
 
Also, I have no idea how to do that with the Saturn 68k (no tools, no assembler/compiler, and I know no way to send code to this 68k).

Compiler/assembler is anything that will let you produce a raw binary suited to the Saturn 68k memory map (not much to it IIRC, basically you're looking at RAM from 0x000000-0x07FFFF). To run it AFAIK you just have to upload the program to the beginning of SCSP RAM (including vector table) and use the SMPC to reset the 68K. Bart Trzynadlowski's page has some tools and sample programs IIRC.
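For what it's worth, here is a hedged sketch in C of what such a raw binary looks like, assuming the standard 68000 layout: a 1 KB vector table at address 0 whose first two entries are the initial stack pointer and initial program counter, both big-endian. Placing the code directly after the vector table and the stack at the top of RAM are my own choices for the example, and `build_68k_image` is a hypothetical helper, not part of any Saturn toolchain. Copying the image into SCSP RAM and resetting the 68K through the SMPC are hardware-specific steps that are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SOUND_RAM_SIZE   0x80000u  /* 0x000000-0x07FFFF as seen by the 68K */
#define VECTOR_TABLE_LEN 0x400u    /* 256 vectors x 4 bytes on a 68000 */

/* The 68000 is big-endian: store a 32-bit value most significant byte first. */
static void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Build a raw image: vector table first, code right after it.
   Returns the image size in bytes, or 0 if it does not fit. */
static size_t build_68k_image(uint8_t *image, size_t image_cap,
                              const uint8_t *code, size_t code_len)
{
    assert(image != NULL && code != NULL);
    if (code_len > SOUND_RAM_SIZE - VECTOR_TABLE_LEN)
        return 0;                              /* larger than sound RAM */
    size_t total = VECTOR_TABLE_LEN + code_len;
    if (total > image_cap)
        return 0;                              /* caller's buffer too small */

    memset(image, 0, VECTOR_TABLE_LEN);        /* unused vectors left at zero */
    put_be32(image + 0, SOUND_RAM_SIZE);       /* vector 0: initial stack pointer (assumed: top of RAM) */
    put_be32(image + 4, VECTOR_TABLE_LEN);     /* vector 1: initial PC (assumed: code follows the table) */
    memcpy(image + VECTOR_TABLE_LEN, code, code_len);
    return total;
}
```

A real program would also fill in the exception and interrupt vectors it cares about instead of leaving them at zero.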
 
Out of curiosity (I'm trying to learn a little about emulation), when are you drawing to the screen? During the VBlank period?
 
OK vbt, here are a few tips & tricks for you :) In the past 3 years I've made SGEs (Single Game Emulators) for the Saturn for my own use. Use the SH-2s for the graphics and the Saturn 68K for the sound. One of the SH-2s has to be used to store all the graphics, and the other SH-2 is used for reading the graphics back. Simple read & write commands. You were needing speed, right? That's where the SH-2 that reads the graphics comes into use :) . See, you've got all of the graphics of the NES ROM loaded onto one of the SH-2s, with not much memory used, leaving some for speed!! Don't worry about the sound, because the Saturn 68K is all you will need. One more thing: after each ROM has been played, you need a command for the emu to reset the SH-2s' memory, so that loading another game will not cause a problem. And one more: don't have that command write the emu data to memory; have the data read off the CD, and in the reset case have it read off the CD as well. I'll try to help you more if you want me to.
 
I had a little trouble understanding your recommendations. The only useful recommendation I could come up with is that you're saying to have one of the SH-2 processors emulate the PPU by using its private RAM as PPU RAM in order to save on memory bandwidth. Is that what you're saying? The rest was pretty obvious (use the sound CPU for sound, initialize your memory to a known state before using it).

BTW, what games have you emulated?
 
:) ExCyber, for a simple short answer: yes, that's what I was saying. If vbt could do this right, he would have no problem emulating the NES on the Saturn at full speed. The games I've emulated were NES and SNES games (The Legend of Zelda, The Adventure of Link, The Legend of Zelda: A Link to the Past, Super Mario World). Each game ran at about 97% of full speed; the sound was not poor, in fact the sound was good, with no skipping. Save RAM was supported for both NES and SNES.
 