Some technical questions

RockinB

How long is the vblank period, i.e. the time between the vblank-in and vblank-out interrupts? This is the time when we can safely access VRAM, so it's good to know how much computation can be done during it.

From the discussion about PAL-progressive video output of modern DVD players (or from photos taken of TV screens) you know that only half of a frame is shown at a time, followed by the other half. Is that true for NTSC as well? If so, why don't we just let the VDP1 render only the top half and then the bottom half? This assumes that drawing is in sync with the TV update, and it would most likely only be a sensible option when running at full speed, e.g. 50 FPS for PAL and 60 FPS for NTSC.

The goal is to reduce the load on the VDP1 in order to increase the number of frames per second. One can set the framebuffer write & erase area manually.

It is said that VDP1 color calculations (Gouraud...) are only possible with RGB format character patterns. Could they be possible with the color lookup table format, too? After all, the lookup table can contain both RGB and palette pixel formats.

Double interlace in VDP1 uses both framebuffers at the same time, each being written to by VDP1 and read from by VDP2. So what about VRAM access conflicts between the two video processors? When would the erase take place?

The high resolution mode of VDP1 is IIRC only possible using 8 bits per pixel palette format. So how can these high resolution fighting games use gouraud shading? Or did they use double interlace instead? Or am I slightly wrong about that....

Why can't the SCU perform burst read (DMA?) from the VRAM (possibly including framebuffers)? Can instead the CPU perform DMA read access in these areas?

Thanks guys!
 
Originally posted by Rockin'-B@Sun, 2005-08-21 @ 05:42 PM

From the discussion about PAL-progressive video output of modern DVD players (or from photos taken of TV screens) you know that only half of a frame is shown at a time, followed by the other half. Is that true for NTSC as well? If so, why don't we just let the VDP1 render only the top half and then the bottom half? This assumes that drawing is in sync with the TV update, and it would most likely only be a sensible option when running at full speed, e.g. 50 FPS for PAL and 60 FPS for NTSC.
Even though the fields are sometimes called "upper" and "lower" they contain every other scanline and span the whole screen height. If you want to display different images for each field, use the noninterlace mode (double-interlace mode would work as well, but be a bit pointless).

It is said that VDP1 color calculations (Gouraud...) are only possible with RGB format character patterns. Could they be possible with the color lookup table format, too? After all, the lookup table can contain both RGB and palette pixel formats.
You can use Gouraud shading with palettized textures, but it will interpolate the texture index values, not the actual colours. The same applies for transparency, shadow and half-brightness.
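To make the index-interpolation point concrete, here is a rough software model. It assumes that Gouraud correction works as a signed per-channel offset (each 5-bit channel of the shading value is added to the pixel's channel with 16 as the neutral value, then clamped to 0..31); that is how the behaviour is usually summarized, but treat it as an assumption rather than a statement about the hardware. The names `gouraud_rgb` and `gouraud_apply` are made up for this sketch. If the 16-bit word is a palette code instead of an RGB value, its low five bits (the "red" field) are simply the low bits of the colour index, so a red-only offset moves the index.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a shaded 5-bit channel to 0..31. */
static int clamp5(int v)
{
    if (v < 0)  return 0;
    if (v > 31) return 31;
    return v;
}

/* Pack three 5-bit channels into a shading value (hypothetical helper). */
static uint16_t gouraud_rgb(int r, int g, int b)
{
    assert(r >= 0 && r < 32 && g >= 0 && g < 32 && b >= 0 && b < 32);
    return (uint16_t)((b << 10) | (g << 5) | r);
}

/* Assumed model: every channel gets (shade - 16) added, then clamped.
 * The top bit of the pixel is passed through unchanged. */
static uint16_t gouraud_apply(uint16_t pixel, uint16_t shade)
{
    int r = clamp5((pixel & 0x1F)         + (shade & 0x1F)         - 16);
    int g = clamp5(((pixel >> 5) & 0x1F)  + ((shade >> 5) & 0x1F)  - 16);
    int b = clamp5(((pixel >> 10) & 0x1F) + ((shade >> 10) & 0x1F) - 16);
    return (uint16_t)((pixel & 0x8000) | (b << 10) | (g << 5) | r);
}
```

Under this model, `gouraud_apply(0x0105, gouraud_rgb(19, 16, 16))` turns palette code 0x0105 into 0x0108: the index moves by three entries, and which colours that produces depends entirely on how the palette is laid out. Because of the clamp, only the 32 entries sharing the same upper bits are reachable.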

Double interlace in VDP1 uses both framebuffers at the same time, each being written to by VDP1 and read from by VDP2. So what about VRAM access conflicts between the two video processors? When would the erase take place?
When using double-interlace you're still only accessing the "back buffer". The only difference is that one framebuffer will hold the even-numbered lines and the other the odd-numbered ones. Apart from setting the odd/even bit and essentially drawing the same screen twice, handling is the same as for other modes.
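As an illustration of the even/odd split only, the following sketch copies the lines belonging to one field out of a full-height image. It is a CPU-side model with a made-up function name; on the real hardware the VDP1 would draw each field itself with the odd/even bit set, not copy lines around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the lines of one field out of a full-height image.
 * field 0 takes lines 0, 2, 4, ...; field 1 takes lines 1, 3, 5, ...
 * 'out' must have room for the selected lines. */
static void extract_field(const uint16_t *full, size_t width, size_t height,
                          int field, uint16_t *out)
{
    assert(field == 0 || field == 1);
    for (size_t y = (size_t)field, row = 0; y < height; y += 2, ++row)
        memcpy(out + row * width, full + y * width, width * sizeof *full);
}
```

Each field buffer ends up with half the lines but still spans the whole screen height, which is why rendering "top half, then bottom half" does not match what the display actually shows.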

The high resolution mode of VDP1 is IIRC only possible using 8 bits per pixel palette format. So how can these high resolution fighting games use gouraud shading? Or did they use double interlace instead? Or am I slightly wrong about that....
I don't remember VF2, at least, using Gouraud shading. You can do it with carefully designed palettes, but it's more probable that only the vertical resolution was increased.

Why can't the SCU perform burst read (DMA?) from the VRAM (possibly including framebuffers)? Can instead the CPU perform DMA read access in these areas?
IIRC it will interfere with RAM refresh. You can use CPU (non-burst) DMA access anywhere you can access manually; it operates the same.
 
Thanks for your response!

So the upper and lower field just contain either even or odd lines. The double interlace mode should thus be high quality. But if the game does not run at 50/60 Hz this would mean the same image for even and odd lines for a couple of fields/frames. So using double interlace would be best at full speed.

About using color calculations with character patterns made using a color lookup table....it could be possible, maybe. The 16 color lookup table is an excellent method for saving texture memory. AND for assigning different priorities and different color calculation registers for each single pixel in palette mode.

I've read the CHROME demo of SGL once again. It does indeed make it possible to put highlights into the interior of polygons, which would otherwise only be possible by increasing the polygon count.

The author furthermore says that it renders faster than in RGB mode. But I'm not sure whether that's right, or for what reason.

Using this method is really crazy and difficult. I wonder if one could really efficiently exploit this, not only because of the lack of tools.
 
Originally posted by Rockin'-B@Mon, 2005-08-22 @ 01:23 PM

So the upper and lower field just contain either even or odd lines. The double interlace mode should thus be high quality. But if the game does not run at 50/60 Hz this would mean the same image for even and odd lines for a couple of fields/frames. So using double interlace would be best at full speed.
Yes, you have to pretty much design your game around never missing the frame buffer swap if you use that mode.

About using color calculations with character patterns made using a color lookup table....it could be possible, maybe.
Take a look at late 80s and early 90s DOS demos for examples of what you can do with palette manipulation.

The author furthermore says that it renders faster than in RGB mode. But I'm not sure whether that's right, or for what reason.
Could very well be, but depends on the internal design of VDP1. Using RGB textures you need to read two bytes per source pixel, but only one when using VDP2-palettized textures. VDP1-palettized textures would need three reads (index and colour definition).
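The byte counts above can be turned into a quick back-of-the-envelope comparison. This sketch uses exactly the simple model from the post (two bytes per RGB texel, one per VDP2-palettized texel, three per VDP1 lookup-table texel) and ignores any caching or multi-pixel fetches the VDP1 may do internally; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum tex_format { TEX_RGB16, TEX_PAL8, TEX_CLUT4 };

/* Bytes fetched from VDP1 VRAM per source pixel under the simple model:
 * no caching and no multi-pixel fetches. */
static size_t bytes_per_texel(enum tex_format f)
{
    switch (f) {
    case TEX_RGB16: return 2; /* one 16-bit colour */
    case TEX_PAL8:  return 1; /* one index, colour resolved later by VDP2 */
    case TEX_CLUT4: return 3; /* index read plus a 16-bit table entry */
    }
    assert(!"unknown texture format");
    return 0;
}

/* Total source bytes read for a w x h texture. */
static size_t texture_read_bytes(enum tex_format f, size_t w, size_t h)
{
    return bytes_per_texel(f) * w * h;
}
```

For a 64x64 texture this gives 8192, 4096 and 12288 bytes respectively, so in this model the palettized format halves the source reads compared with RGB. Whether the real chip behaves this way is exactly the open question in the post.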

By the way, I checked with Charles MacDonald's VDP1 doc, and the shadow and half-transparency modes will not change palette values in the framebuffer. Instead, shadow will skip the pixel and half-transparency will write over it.
 
I could verify in the VDP1 manual that all color calculations can be done in color lookup table mode, too. This includes the palette Gouraud shading tricks, although the manual says that the result cannot be guaranteed. Hopefully this doesn't mean that there are different VDP1 revisions, some of which behave differently in this case.

I'm interested in performance benchmarks of the different drawing modes to find an optimal solution.

The only things I found are:

- "shadow and half-transparency take 6 times longer, than without color calculation"

- "pre clipping takes up to 5 CPU cycles per line"

Originally posted by antime@Mon, 2005-08-22 @ 08:24 PM

Could very well be, but depends on the internal design of VDP1. Using RGB textures you need to read two bytes per source pixel, but only one when using VDP2-palettized textures. VDP1-palettized textures would need three reads (index and colour definition).


So the only reason for speedup is due to reduced access to the VDP1 VRAM.

BTW: This trick allows textured-looking polygons to be produced by drawing non-textured polygons (a further speedup), using a palette code for the polygon color plus Gouraud shading. But the result is even more restricted.

Originally posted by antime@Mon, 2005-08-22 @ 08:24 PM

By the way, I checked with Charles MacDonald's VDP1 doc, and the shadow and half-transparency modes will not change palette values in the framebuffer. Instead, shadow will skip the pixel and half-transparency will write over it.

Yes, that allows color calculations to be used on a framebuffer with mixed content, RGB and palette pixels.

A quite useful graphic can be found in the VDP1 manual, page 111.
 
Originally posted by Rockin'-B@Tue, 2005-08-23 @ 11:22 AM

So the only reason for speedup is due to reduced access to the VDP1 VRAM.
Well, that's the best reason I can think of. The extra cost of shadow/half-transparency should give some idea about how much those avoided reads are worth. If VDP1 is smart about reading texture values, it can save even more cycles by reading more than one source pixel at a time. Possibly it also reads the palette to an internal buffer, which would mean a small fixed overhead but overall large savings for VDP1-textured parts. If you have the time and inclination you could try to measure the time needed to draw different texture formats.
 
You can use Gouraud shading with palettized textures, but it will interpolate the texture index values, not the actual colours.
Isn't there one palette mode that actually uses a CLUT stored in VDP1 VRAM to render RGB to the framebuffer, allowing you to do both palettized textures and color calculation? I'm pretty sure I saw this somewhere...
 
Questions remaining:

How long is the V-Blank period?

How long is the H-Blank period?

BTW: it's good to know that the clock speed of the VDP1, the CPU and maybe the whole system is significantly higher (> 28 MHz instead of < 27 MHz) when using horizontal resolution 352 or 704 instead of 320 or 640.
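For a rough feel of the numbers, the blanking time can be estimated from the line count and line rate. The sketch below is only an estimate under stated assumptions: about 262 lines per NTSC field, 224 visible lines and a line rate of roughly 15.734 kHz are typical NTSC figures, not values from this thread, and the actual spacing of the vblank-in and vblank-out interrupts should be checked against the VDP2 manual. The function names are made up.

```c
#include <assert.h>

/* Approximate vertical blanking time in microseconds:
 * (lines not displayed) x (time per line). */
static double vblank_us(int total_lines, int active_lines, double line_hz)
{
    assert(total_lines > active_lines && line_hz > 0.0);
    return (total_lines - active_lines) * 1.0e6 / line_hz;
}

/* Clock cycles that fit into a time span, for a clock given in MHz. */
static double cycles_in(double microseconds, double clock_mhz)
{
    return microseconds * clock_mhz;
}
```

With the assumed NTSC figures, `vblank_us(262, 224, 15734.0)` comes out at about 2.4 ms. Feeding that into `cycles_in` with a clock just under 27 MHz or just over 28 MHz shows how the faster 352/704 modes buy a few thousand extra cycles per blanking period.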

Originally posted by ExCyber@Thu, 2005-08-25 @ 09:49 AM

Isn't there one palette mode that actually uses a CLUT stored in VDP1 VRAM to render RGB to the framebuffer, allowing you to do both palettized textures and color calculation? I'm pretty sure I saw this somewhere...

VDP1 supports the color modes 16-bit RGB and palette (with the palette stored inside VDP2). Additionally, VDP1 supports CLUT, which is a table in VDP1 VRAM with 16 entries of 16 bits each. In CLUT mode, the VDP1 DOES NOT CARE ABOUT THE PIXEL FORMAT; it just writes the entry as is into the framebuffer (which is of interest for palette, read on).

So the normal palette mode allows setting priority and VDP2 color calculation bits PER SPRITE, while the advantage of using a CLUT with palette entries is that it allows setting them PER PIXEL.

Color calculations are possible in RGB mode and in CLUT mode with a table containing RGB pixels. As we know from the CHROME demo, it's possible to interpolate palette indices using red Gouraud shading in palette mode and in CLUT mode with a table containing palette entries.
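The lookup-table behaviour described in this post can be modelled in a few lines. This is a sketch under assumptions, not the hardware's actual fetch logic: it assumes 4-bit texels packed two per byte with the high nibble being the left-hand pixel, and `clut_expand` is a made-up name. The point it illustrates is that the 16-bit table entry is copied as is, whatever it encodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand 4-bit texels through a 16-entry, 16-bit lookup table.
 * Assumption: the high nibble of each byte is the left-hand pixel.
 * Entries are written out unchanged, whether they hold an RGB value
 * or a palette code with priority/colour-calculation bits. */
static void clut_expand(const uint8_t *texels, size_t pixel_count,
                        const uint16_t clut[16], uint16_t *out)
{
    assert(pixel_count % 2 == 0);
    for (size_t i = 0; i < pixel_count; i += 2) {
        uint8_t byte = texels[i / 2];
        out[i]     = clut[byte >> 4];
        out[i + 1] = clut[byte & 0x0F];
    }
}
```

Since every texel picks its own table entry, two neighbouring pixels of the same sprite can end up with different priority or colour-calculation bits, which is the per-pixel control mentioned above.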
 
Reading the DTS documents I've found some further statements about VDP1 drawing speeds:

Q: What is the performance penalty for using the VDP1 Gouraud shading hardware?

A: A measurement of the time required to draw several hundred non-textured polygons with and without Gouraud shading indicated that using Gouraud shading slows down the VDP1 about a third. It's not clear if the penalty is any different when using textures.

Note that if all you want is a cheap, flat-shaded lighting model, you can do it without incurring a performance hit by making your sprites paletted and enabling color calculations between the sprites and the line screen. Each paletted sprite can select its own color calculation ratio, giving up to eight levels of brightness or dimness.


BTW: I've just got SCU indirect DMA to work.
 
Another quote, from Charles MacDonald's VDP1 hardware notes:

The Replace, Gouraud, and Half-luminance modes all take the same amount of time. Gouraud takes slightly more, but only by a few cycles - this is most likely due to the shading table being read, and not from any per-pixel processing which would have a larger impact on timing.

I assume that Charles' timings were performed with textured polygons. The difference in the performance hit could be explained by Sega's DTS using very small primitives, making the reading of the shading table a significant overhead, or by some internal pipelining where the per-pixel calculation cost is absorbed by the cost of reading the source texels. It would however be slightly odd to use small polygons to test fillrate, so provided the DTS figure is correct I'm leaning towards the latter explanation.
 
Here is a nice thing I discovered:

Less restricted hires display:

When using high resolution mode (horizontal resolution of 640 or 704), the VDP2 manual says you'll have to use NBG0 and NBG1, along with a couple of other restrictions.

Well, I don't know if that is a translation error from the Japanese original (it would imply being able to use no more than one NBG scroll), because it works with only one NBG, too.

Isn't that cool?

Another thing that's interesting:

You can set different resolutions for VDP1 and VDP2. The coolest thing about this is that you are not forced to use an 8bpp palette-only framebuffer for VDP1 when using a hi-res display. That's what all fighting games do.
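A simple capacity check suggests why a hi-res VDP1 framebuffer ends up at 8 bits per pixel, and why keeping VDP1 at a lower resolution avoids that. The sketch assumes each VDP1 framebuffer is 256 KB and treats it as a flat array; the real hardware uses fixed framebuffer layouts, so this is only a rough size comparison, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size of one VDP1 framebuffer. */
#define VDP1_FB_BYTES (256u * 1024u)

/* Returns 1 if a width x height image at the given depth fits into one
 * framebuffer, 0 otherwise. Only 8 and 16 bits per pixel are modelled. */
static int fits_framebuffer(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel == 8 || bits_per_pixel == 16);
    return width * height * (bits_per_pixel / 8) <= VDP1_FB_BYTES;
}
```

Under these assumptions a 704x240 image does not fit at 16 bits per pixel but does at 8, while 352x240 fits at 16. So a 352-wide RGB framebuffer under a 704-wide VDP2 display keeps RGB colour calculation available for sprites.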

Question:

There are 3 modes: non-interlaced, single interlaced and double interlaced. SGL does not provide a way to set this explicitly, so it must be implied by some other setting like the screen resolution. Does anyone know something about this?
 