SGL replacement

RockinB

Staff member
Font improved.

I found out that the strange font display problem (mentioned above) only appears in some programs, while it works correctly in others.

Anyway, I've reported that snake crashes, and I've seen the same with other programs, too. It crashes on the first call to malloc(), for example.

Now I looked around again and realized that I'm probably missing some initialization. At the very least the stack pointer; SGL also sets GBR and SR. Previously I wrote the dummy functions SGL_Start and slStartSGL (in a C file), which I could place in section SLSTART (0x6004000) using __attribute__ and which call main().

Setting the stack pointer (r15) could be done in an assembler file, or maybe in a C file using the asm keyword. My problem is that the first approach is not linked into the binary, although I declared .section SLSTART. The second attempt fails with an invalid opcode:

asm("mov.l %[stack_pointer], r15": : [stack_pointer] "2" (0x6004000));

and I don't know why.
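
A possible explanation, offered as a guess rather than a verified fix: in GCC extended asm a digit constraint such as "2" is a matching constraint that refers to operand number 2, and this statement only has operand 0. In addition, on SH the mov.l instruction reads its source from memory, and an immediate mov only takes an 8-bit value, so a 32-bit constant cannot be encoded directly. The sketch below lets the compiler load the constant into a register first. The names set_stack_pointer and STACK_TOP are made up for illustration, and the non-SH branch exists only so the snippet also compiles on a host PC.

```c
#include <stdint.h>

/* Address taken from the post above; whether it is the right
 * stack top for a given program is an assumption. */
#define STACK_TOP 0x06004000u

/* Hypothetical helper: copies `top` into r15 on SH targets.
 * The "r" constraint asks the compiler to place the value in a
 * general register, so the 32-bit constant never has to be
 * encoded as an immediate. */
static inline void set_stack_pointer(uint32_t top)
{
#if defined(__sh__)
    __asm__ volatile ("mov %0, r15" : : "r" (top) : "memory");
#else
    (void)top; /* no-op when built for anything other than SH */
#endif
}
```

Changing r15 in the middle of a compiled C function is fragile, because the compiler may restore it in the function epilogue, so startup assembly is probably the safer place for it.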

How can I get the asm routines linked first at SLSTART (0x6004000)? I know I could just pass the object file to the linker first, but this is supposed to ship as a library...

I get a headache from such stuff :damn:

Originally posted by Reinhart@Fri, 2006-03-31 @ 06:54 PM

The ability to handle lots of sprite without bothering with memory management ...



Is that a pro or a con? I would like to know what could have been done better in the SGL, in your opinion.
 

RockinB

Staff member
Update: setting the stack pointer won't help me, since it defaults to a commonly used area.

Furthermore: I took vdp1ex from Charles MacDonald and inserted a malloc() call. (interesting to see what a lot of shit this requires to get it compiled)

Okay, running this example with this little modification causes the same problem that I want to solve in my SGL replacement: an invalid opcode in yabause (and it won't run on a Saturn either).

I haven't got much knowledge of newlib or any bullshit involved. I don't know what _main() is for, nor whether newlib needs to be initialized somehow. Can any of you bright guys help me?
 

antime

Extra Hard Mid Boss
The default Newlib memory allocation routines assume a normal configuration where the heap grows upwards from the end of the code towards the end of memory, and the stack grows downwards from the end of memory. When growing the heap it checks against the stack pointer, and if they overlap Newlib tries to signal an error. The initial heap pointer is taken from the linker-provided symbol "end", and since you're trying to put the stack before the code this will always fail. To work around this problem you will have to provide your own implementation of the sbrk function (see /newlib/libc/sys/sh/syscalls.c).
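
A minimal sketch of such a replacement, assuming the heap lives in a fixed region instead of being checked against the stack pointer. The array heap_area, its size and the name my_sbrk are hypothetical; in a real Newlib build the function would presumably take the name the C library expects (see syscalls.c), and the region would more likely come from linker script symbols.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical heap region. On real hardware the bounds would
 * more likely be symbols exported by the linker script. */
static unsigned char heap_area[64 * 1024];
static unsigned char *heap_cur = heap_area;

/* Grows or shrinks the heap by `incr` bytes and returns the
 * previous break, or (void *)-1 with errno set to ENOMEM when
 * the request would leave the fixed region. The stack pointer is
 * never consulted, so the stack may sit below the code. */
void *my_sbrk(ptrdiff_t incr)
{
    unsigned char *prev = heap_cur;
    ptrdiff_t used = heap_cur - heap_area;
    ptrdiff_t size = (ptrdiff_t)sizeof heap_area;

    if (incr > size - used || incr < -used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_cur += incr;
    return prev;
}
```

Because the bounds are fixed, an allocation that does not fit fails cleanly instead of running into whatever follows the heap.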

Setting the stack pointer is most easily done as such:

Code:
register unsigned int StackPtr asm ("r15") = 0x6004000;
 

RockinB

Staff member
Hey antime,

thanks for your help. This really seems sound, I already had a look into syscalls.c (seems like one could make standard printf work on Saturn with this, by outputting it via commslink or serial).

However, I could not find a custom sbrk in the SGL disassembly. I'll try it out nevertheless.
 

antime

Extra Hard Mid Boss
It wouldn't be in SGL, because malloc is part of the C library. Besides, Sega offered their own memory allocation functions in SGL/SBL and mixing their use would not be a smart thing to do.
 

RockinB

Staff member
I was wrong about the stack pointer in SGL. MEMORY.TXT says two different things, and in yabause I could confirm that the stack is placed at the end of high memory, up to 0x60FFC00.

So there is no need for a custom sbrk.

Okay, so I tried to set the stack pointer somehow (2 ways described above):

Doing this in a function:

Code:
register unsigned int StackPtr asm ("r15") = 0x6004000;

This is optimized away. Declaring it global and assigning it in the function (or applying some inline assembler) results in r15 being saved and then restored after the assignment.

I don't know if that can be prevented; otherwise this assignment is useless.

I still have some assembler startup code (modified from Charles and the like), but the file isn't even linked by the linker. This code isn't referenced anywhere else (since it's executed first), so that might be the reason. But when I tried to do it in a C file (with the section attribute), it was linked.

Could it be that the C file includes other stuff which is referenced, but the SGL_CRT0.S does not?

Maybe I should place some required stuff inside the assembler file, so that it's linked.
 

antime

Extra Hard Mid Boss
Originally posted by Rockin'-B@Mon, 2006-04-03 @ 01:32 PM

Doing this in a function:

Code:
register unsigned int StackPtr asm ("r15") = 0x6004000;

Is optimized away.
"volatile" helps here too.

But I still have some assembler startup code (modified from Charles and the like), but still the file isn't even linked by the linker. You know this code isn't referenced elsewhere(since it's executed first), so that might be a reason. But when I tried to do it in a C file (with section attribute), it was linked.

Could it be that the C file includes other stuff which is referenced, but the SGL_CRT0.S does not?
If your entry point isn't called "start" you can specify it on the linker command line using the "-e" switch. That may help.
 

RockinB

Staff member
Originally posted by antime@Mon, 2006-04-03 @ 05:29 PM

"volatile" helps here too.


I thought so, too, but the compiler complained. Can't remember anymore if I tried "volatile register" or "register volatile", guess the first one.

Originally posted by antime@Mon, 2006-04-03 @ 05:29 PM

If your entry point isn't called "start" you can specify it on the linker command line using the "-e" switch. That may help.




I used the standard SGL start entry: __Start

Anyway, I placed some required variables in the assembler file and whoops, it's being linked by the linker, at the correct position.

So the stack pointer problem is solved, which makes snake work and gets the others a bit further.

I'm currently searching for and removing some bugs, as you can see. After finding out the position of the SGL VDP2 register buffer, I could dump it (even though yabause doesn't support a VDP2 register dump directly) and compare it to the values set by SGLrep.

While doing so, I added a couple of window related functions. All VDP2 registers are now equal, except for rotation scroll related stuff and cycle patterns (mine are even better, for the case tested).

Remaining probs are the wrong text problem (I could narrow the reason a bit, maybe it's gone, now) and some apps exit to multiplayer now.
 

RockinB

Staff member
Originally posted by Rockin'-B@Mon, 2006-04-03 @ 08:16 PM

Remaining probs are the wrong text problem (I could narrow the reason a bit, maybe it's gone, now) and some apps exit to multiplayer now.



The wrong text prob was due to cycle patterns (how much I hate touching slScrAutoDisp, again :puke: ). There are always new(?) restrictions to discover. Fortunately, all of my stuff which uses a bitmap scroll implies that cycle patterns are set correctly :p .

Okay, the exit was just some controller stuff in SGLrep which made it think START+A+B+C is pressed. It behaved really strangely. I dunno, it's not present any more (I changed the controller stuff a bit, but that should be a different thing).

Okay finally, I could run my GBC emu port with SGL replacement :banana :rockin: :smash .

And guess what: even with the low clock (26.8 instead of 28.6 MHz), it ran exactly as fast as the SGL version with the higher clock! When disabling vblank waiting, it lost a bit of its speed advantage. SGL apparently does some additional computation in slSynch(). Then I managed to implement the automatic clock change in SGLrep and so it won over the equivalent SGL setup!!!!

Though, the speed increase with the higher clock is only about 7 percent, instead of the 10 percent expected (anyone guess why?). And I'm getting some garbage on the right side of the screen, where the 32 additional pixels per line appear.
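
A quick check of the arithmetic, assuming speed scales linearly with the system clock and taking the two clock figures quoted in this post: the ratio itself only predicts a gain of a bit under 7 percent. The function name below is made up for illustration.

```c
/* Percentage speed gain expected if run time scaled purely with
 * the clock frequency. This is an idealized model that ignores
 * anything clocked independently of the system clock. */
static double expected_gain_percent(double low_mhz, double high_mhz)
{
    return (high_mhz / low_mhz - 1.0) * 100.0;
}
```

Calling expected_gain_percent(26.8, 28.6) gives roughly 6.7, so the measured 7 percent is about what the clock ratio alone would suggest.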

I'm really happy to finally have it running some of my stuff.
 

ExCyber

Staff member
Though, the speed increase with higher clock is only about 7 percent, instead of 10 percent expected (anyone guess why?).
That speed change doesn't affect the memory speed, does it?
 

RockinB

Staff member
Originally posted by ExCyber@Tue, 2006-04-04 @ 04:34 PM

That speed change doesn't affect the memory speed, does it?



AFAIK the CPU and memory clocks are the same (I've read that the SCU DSP DMA transfer channel is clocked at 28 MHz).

The overall system clock should affect both SH2s, VDP1, VDP2 and memory as well. Only the sound subsystem, the CD block and the SCU DSP run at a different speed.

I measured the speed increase using original SGL and it is only slightly more(compared to SGLrep): 8 percent. My assumption is that's due to the difference of SMPC handling:

- in SGL, it's interrupt driven and the amount of CPU cycles stays the same when clock is increased

- in SGLrep, it's done explicitly every frame (currently SH2 direct mode) and due to increased framerate, the amount of CPU cycles needed for this increases.

Anyway, I've been experimenting with slave usage in my GBC emu port before, and I'd sure like to continue that with SGLrep. I managed to get a slave CPU example running :cheers , so now I can start reimplementing the SGL slave handling!
 

RockinB

Staff member
The SGL replacement finally features slave CPU usage via slSlaveFunc(). This has been pretty tough, as my setup of burning CD-RWs to test stuff is rather unsuited for such low level experiments. I'm very, very satisfied with the result; I guess I did it at least as well as the guys did in the SGL.

A lot of compile flags customize the master-slave behaviour, like command buffer size and especially what to do if the buffer is full as well as cache-purging on the slave side.
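
To illustrate the kind of choices those flags cover, here is a rough sketch of a fixed-size command queue between master and slave. It is not the SGLrep or SGL implementation: CMD_BUF_SIZE, cmd_queue, cmd_push, cmd_run_one and add_one are all hypothetical names, and the "buffer full" policy shown (report failure to the caller) is only one of several possibilities.

```c
#define CMD_BUF_SIZE 8 /* hypothetical compile-time buffer size */

typedef void (*slave_func)(void *);

typedef struct {
    slave_func func;
    void *arg;
} slave_cmd;

typedef struct {
    slave_cmd cmds[CMD_BUF_SIZE];
    volatile unsigned head; /* advanced only by the master */
    volatile unsigned tail; /* advanced only by the slave */
} cmd_queue;

/* Master side: returns 1 if the command was queued, 0 if the
 * buffer is full. The caller decides whether to wait, drop the
 * command or run it on the master instead. */
static int cmd_push(cmd_queue *q, slave_func f, void *arg)
{
    unsigned next = (q->head + 1u) % CMD_BUF_SIZE;
    if (next == q->tail)
        return 0;
    q->cmds[q->head].func = f;
    q->cmds[q->head].arg = arg;
    q->head = next;
    return 1;
}

/* Slave side: runs one pending command; returns 0 when idle.
 * On real hardware the slave would also have to purge its cache
 * (or use uncached addresses) before reading the queue. */
static int cmd_run_one(cmd_queue *q)
{
    slave_cmd c;
    if (q->tail == q->head)
        return 0;
    c = q->cmds[q->tail];
    q->tail = (q->tail + 1u) % CMD_BUF_SIZE;
    c.func(c.arg);
    return 1;
}

/* Example work item: increments the int that `p` points to. */
static void add_one(void *p)
{
    ++*(int *)p;
}
```

One slot is deliberately left unused so that "full" and "empty" can be told apart without a separate counter that both CPUs would have to write.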

I immediately experimented with using the slave in GNUBoy with the SGL replacement and it works fine (more than that, I measured the highest framerates ever). In GNUBoy, rendering takes much more CPU time than everything else together. I've mentioned countless times that GNUBoy's source code is crap, and that can be experienced when using the slave: it uses global variables, which causes problems when multiple processors work on the code.

So that prevents taking full advantage of the slave. In the end, I think it's better to port another emu, which is easier to use different cores with and then implement hardware rendering, like VBT did in his SMS emu port. Furthermore, on Saturn GNUBoy currently doesn't work with a lot of color games and demos.
 

vbt

Staff member
Originally posted by Rockin'-B@Tue, 2006-04-04 @ 04:45 PM

Though, the speed increase with higher clock is only about 7 percent, instead of 10 percent expected (anyone guess why?). And I'm having some garbage on the right screen side, where the 32 additional pixel per line appear.


Have a look at this function :

SPR_SetEraseData( 0x0000, 8-1, 0, 352-1, 224-1 );
 

RockinB

Staff member
Originally posted by vbt@Thu, 2006-04-06 @ 07:18 PM

Have a look at this function :

SPR_SetEraseData( 0x0000, 8-1, 0, 352-1, 224-1 );



Today I had sgl replacement in action with VDP1 turned off, the green garbage on the right was still there. Actually, it changed color sometimes (I was using it in snes emu, which drew to framebuffer).

Don't know the reason, will have to investigate later.
 

RockinB

Staff member
I found the solution in MANSYS.TXT: the clock change invalidates a lot of registers and RAM areas. I just had to swap the order of the clock change and the TV mode setting.
 