XL2
Established Member
I tried something to reduce memory usage and the number of processed vertices, but it hasn't worked out too well so far, so hopefully someone can suggest a way to speed up the whole process before I move on to another technique:
As mentioned in this post, I imported Quake maps into my engine for testing, and I subdivided all the maps into grids and planes (in other words, you get tiles of quads, where all the quads in a tile face the same direction and sit in the same grid square in global map coordinates).
This allows me to do aggressive culling, which is needed on the Saturn.
But this subdivision duplicates many vertices, since that's how SGL works (each object contains its own vertices), to the point where it's too much for the Saturn and too much for the default SGL work area.
Long story short, I tried to generate a PDATA (the 3D mesh) on the fly, essentially using a lookup table to determine whether a vertex is already in use and to remap each quad's vertex references through that table.
The number of processed vertices is reduced a lot, but the whole process is much slower (a slowdown was expected, but not one this bad).
I guess there might be a way to speed it up by making better use of the CPU cache, or by finding a way to make DMA practical (it isn't right now, since no single transfer is larger than 12 bytes).
Now I'm considering reverting to the old technique of just using static PDATA and hoping that better hidden surface determination keeps everything at manageable levels, but I will still try a few more things before fully ditching the current technique.
Any suggestions on how to speed up the whole memory transfer/lookup?

Code:
void COPY_POINT(POINT source, POINT dest)
{
    dest[X]=source[X];
    dest[Y]=source[Y];
    dest[Z]=source[Z];
}
void ADD_POINT(unsigned short i)
{
    COPY_POINT(VDATA.pntbl[i], LevelMesh.pntbl[LevelMesh.nbPoint]); //Copies the vertex from the global list to the generated PDATA
    VDATA.WRAM_LUT[i]=LevelMesh.nbPoint; //The index where the vertex is stored in the generated PDATA
    ++LevelMesh.nbPoint;
}
void COPY_PDATA(unsigned int i) //It's called for each plane after culling out those not needed
{
    register unsigned int T;
    short *WRAM_LUT=VDATA.WRAM_LUT; //The lookup table for indexing vertices (-1 means not added yet)
    _QDATA * curQuad = QDATA[i]; //The quad data, containing the texture no and the quads' 4 points (pointing to the global vertices list)
    POLYGON * curPol;
    unsigned short * curVert;
    FIXED curNorm[XYZ];
    curNorm[X] = PLANE[i]->norm[X]; curNorm[Y] = PLANE[i]->norm[Y]; curNorm[Z] = PLANE[i]->norm[Z]; //I keep the normals per plane to reduce RAM usage
    for (T=0; T<QDATA[i]->nbPolygon; ++T)
    {
        curVert= curQuad->Vertices[T];
        curPol = &LevelMesh.pltbl[LevelMesh.nbPolygon];
        //Add each of the quad's 4 vertices to the generated PDATA if it isn't there yet
        if (WRAM_LUT[curVert[0]]== -1) ADD_POINT(curVert[0]);
        if (WRAM_LUT[curVert[1]]== -1) ADD_POINT(curVert[1]);
        if (WRAM_LUT[curVert[2]]== -1) ADD_POINT(curVert[2]);
        if (WRAM_LUT[curVert[3]]== -1) ADD_POINT(curVert[3]);
        //Remap the quad's vertex references from global indices to generated-PDATA indices
        curPol->Vertices[0] = WRAM_LUT[curVert[0]];
        curPol->Vertices[1] = WRAM_LUT[curVert[1]];
        curPol->Vertices[2] = WRAM_LUT[curVert[2]];
        curPol->Vertices[3] = WRAM_LUT[curVert[3]];
        curPol->norm[X] = curNorm[X]; curPol->norm[Y] = curNorm[Y]; curPol->norm[Z] = curNorm[Z];
        LevelMesh.attbl[LevelMesh.nbPolygon]=ATTRIBUTE_LIST[QDATA[i]->Texture_ID[T]];
        ++LevelMesh.nbPolygon;
    }
}
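
For anyone who wants to experiment with the idea outside the engine, here is a minimal standalone sketch of the same lookup-table remapping in plain C. It is not the engine code: it does not use SGL, and every name in it (Fixed, Point3, global_points, quads, lut, mesh_points, mesh_quads, reset_lut, remap_vertex, build_mesh) is a hypothetical stand-in, as is the tiny test data. It also assumes the table gets reset to -1 before each rebuild, which the code above doesn't show.

```c
#define GLOBAL_POINT_COUNT 8
#define QUAD_COUNT 2

typedef int Fixed;          /* stand-in for SGL's FIXED (assumption) */
typedef Fixed Point3[3];    /* stand-in for SGL's POINT (assumption) */

/* Hypothetical global vertex list shared by the whole map. */
static const Point3 global_points[GLOBAL_POINT_COUNT] = {
    {0, 0, 0}, {1, 0, 0}, {2, 0, 0}, {3, 0, 0},
    {0, 1, 0}, {1, 1, 0}, {2, 1, 0}, {3, 1, 0}
};

/* Two quads that survived culling; they share global vertices 3 and 6. */
static const unsigned short quads[QUAD_COUNT][4] = {
    {2, 3, 6, 7},
    {3, 5, 1, 6}
};

static short lut[GLOBAL_POINT_COUNT];              /* global index -> mesh index, -1 = unused */
static Point3 mesh_points[GLOBAL_POINT_COUNT];     /* compacted vertex list */
static unsigned short mesh_quads[QUAD_COUNT][4];   /* quads with remapped indices */
static unsigned short mesh_point_count;

/* Mark every global vertex as "not in the generated mesh yet". */
static void reset_lut(void)
{
    unsigned int i;
    for (i = 0; i < GLOBAL_POINT_COUNT; ++i)
        lut[i] = -1;
    mesh_point_count = 0;
}

/* Return the mesh index for global vertex g, copying the vertex on first use. */
static unsigned short remap_vertex(unsigned short g)
{
    if (lut[g] < 0) {
        mesh_points[mesh_point_count][0] = global_points[g][0];
        mesh_points[mesh_point_count][1] = global_points[g][1];
        mesh_points[mesh_point_count][2] = global_points[g][2];
        lut[g] = (short)mesh_point_count;
        ++mesh_point_count;
    }
    return (unsigned short)lut[g];
}

/* Rebuild the compacted mesh from the visible quads. */
static void build_mesh(void)
{
    unsigned int q, k;
    reset_lut();
    for (q = 0; q < QUAD_COUNT; ++q)
        for (k = 0; k < 4; ++k)
            mesh_quads[q][k] = remap_vertex(quads[q][k]);
}
```

Calling build_mesh() on this test data copies each shared vertex only once, so the two quads end up referencing six points instead of eight. The per-vertex copy is still three separate 4-byte writes, which is the same small-transfer pattern that makes DMA impractical in the real code.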

