Ponut
Gear Supporter
I just wanted to share / document this to save someone else some time in the future.
More than likely, no one to whom this information applies will read it, but I digress 🙂
Brief
I studied two (actually three, but the third never worked well enough to test) similar methods of removing duplicate vertices from real-time polygon subdivision.
Note that these systems never produced duplicate polygons, and some vertices were never duplicated.
The conclusion was that even if you precalculate where the duplicates will be generated and avoid making them in the first place, it is slower to handle the potential of a duplicate than it is to just make and process all duplicate vertices.
Why?
The plain truth is that the way I was checking for duplicates cost more than what it saved: the screenspace transform of each duplicate.
Considering I did the math and a vertex could be duplicated over 32 times, this was surprising; then again, most vertices would only be duplicated up to 16 times, and the average is probably between two and three.
To truly understand what happened, I need to show the basic function which subdivides a tile, and explain what a tile is.
Structures in Question
The data format is literally as follows (without any precomputed duplicate removal tables):
C:
/////////////////////////////////
// Sector Data
/////////////////////////////////
typedef struct
{
entity_t * ent;
unsigned short nbPolygon; //# of polygons (planes) in the sector (for collision)
unsigned short nbPoint; //# of points in the sector (for collision)
unsigned short nbTileVert; //# of vertices for the tiles (to draw) in the sector
unsigned short nbAdjacent; //# of sectors which are adjacent to this sector
unsigned short * nbTile; //# of tiles (polygons to draw) in each plane of sector, of size <nbPolygon>
unsigned short * plStart; //Starting tile # in each plane of sector, of size <nbPolygon>
unsigned short * pltbl; //Stores the polygon IDs from <entity> which are in this sector; of size <nbPolygon>
unsigned short * pntbl; //Stores the vertex IDs from <entity> which are in this sector; of size <nbPoint>
unsigned short * adtbl; //Stores the sector IDs from <entity> which are adjacent to this sector; of size <nbAdjacent>
unsigned short * altbl; //Stores the polygon ID alias of the tiles from <entity> in the sector; of size <nbTile>
_quad * tltbl; //Stores the sector-specific vertex IDs used to draw the sector's tiles
POINT * tvtbl; //Stores the sector-specific vertices used to draw the sector's tiles
} _sector;
The important hierarchy: planes (which are polygons) are tessellated (pre-subdivided) into tiles, and those tiles are subdivided into more tiles (polygons).
This discussion is about the vertices generated from subdividing those tiles, which is done at runtime.
This discussion is not about precomputed subdivision; that is completely viable and, as noted earlier, something the test software already did.
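To make that hierarchy concrete, here is a minimal sketch of walking every tile of every plane in a sector, using only the fields documented in the struct above (the function and loop-variable names are mine):
C:
//Sketch only: visits each plane's tiles via the _sector fields above.
//Each plane's tiles occupy a contiguous run of the tile tables.
void walk_sector_tiles(_sector * sct)
{
    for(unsigned short pl = 0; pl < sct->nbPolygon; pl++)
    {
        unsigned short start = sct->plStart[pl];
        for(unsigned short t = start; t < start + sct->nbTile[pl]; t++)
        {
            //sct->tltbl[t] names four sector-local vertex IDs; each one
            //indexes sct->tvtbl to get a POINT to draw with.
            //sct->altbl[t] maps the tile back to its source polygon.
        }
    }
}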
Methods in Question
The primary method is the naive one: Presuppose nothing but the duplicates each tile knows about within itself and where it came from.
One iteration of that code looks like this (bearing in mind it is called recursively):
C:
void subdivide_xy(_subdivision_settings * set)
{
//////////////////////////////////////////////////////////////////
// Subdivide by all rules / Subdivide polygon into four new quads
//Turn 4 points into 9 points
//Make the 4 new polygons
//////////////////////////////////////////////////////////////////
/*
0A 1A | 0B 1B
A B
3A 2A | 3B 2B
0D 1D | 0C 1C
C D
3D 2D | 3C 2C
*/
// Initial Conditions
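//Note: the first assignment below is a no-op (quad A keeps its own corner 0);
//it is presumably written out to keep the four-corner pattern visible.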
subdivided_polygons[set->poly_a][0] = subdivided_polygons[set->poly_a][0];
subdivided_polygons[set->poly_b][1] = subdivided_polygons[set->poly_a][1];
subdivided_polygons[set->poly_d][2] = subdivided_polygons[set->poly_a][2];
subdivided_polygons[set->poly_c][3] = subdivided_polygons[set->poly_a][3];
// Center
//
subdivided_points[set->tgt_pnt][X] = (set->ptv[0][X] + set->ptv[1][X] +
set->ptv[2][X] + set->ptv[3][X])>>2;
subdivided_points[set->tgt_pnt][Y] = (set->ptv[0][Y] + set->ptv[1][Y] +
set->ptv[2][Y] + set->ptv[3][Y])>>2;
subdivided_points[set->tgt_pnt][Z] = (set->ptv[0][Z] + set->ptv[1][Z] +
set->ptv[2][Z] + set->ptv[3][Z])>>2;
subdivided_polygons[set->poly_a][2] = set->tgt_pnt;
subdivided_polygons[set->poly_b][3] = set->tgt_pnt;
subdivided_polygons[set->poly_d][0] = set->tgt_pnt;
subdivided_polygons[set->poly_c][1] = set->tgt_pnt;
set->tgt_pnt++;
// 0 -> 1
subdivided_points[set->tgt_pnt][X] = (set->ptv[0][X] + set->ptv[1][X])>>1;
subdivided_points[set->tgt_pnt][Y] = (set->ptv[0][Y] + set->ptv[1][Y])>>1;
subdivided_points[set->tgt_pnt][Z] = (set->ptv[0][Z] + set->ptv[1][Z])>>1;
subdivided_polygons[set->poly_a][1] = set->tgt_pnt;
subdivided_polygons[set->poly_b][0] = set->tgt_pnt;
set->tgt_pnt++;
// 1 -> 2
subdivided_points[set->tgt_pnt][X] = (set->ptv[2][X] + set->ptv[1][X])>>1;
subdivided_points[set->tgt_pnt][Y] = (set->ptv[2][Y] + set->ptv[1][Y])>>1;
subdivided_points[set->tgt_pnt][Z] = (set->ptv[2][Z] + set->ptv[1][Z])>>1;
subdivided_polygons[set->poly_b][2] = set->tgt_pnt;
subdivided_polygons[set->poly_d][1] = set->tgt_pnt;
set->tgt_pnt++;
// 3 -> 2
subdivided_points[set->tgt_pnt][X] = (set->ptv[2][X] + set->ptv[3][X])>>1;
subdivided_points[set->tgt_pnt][Y] = (set->ptv[2][Y] + set->ptv[3][Y])>>1;
subdivided_points[set->tgt_pnt][Z] = (set->ptv[2][Z] + set->ptv[3][Z])>>1;
subdivided_polygons[set->poly_d][3] = set->tgt_pnt;
subdivided_polygons[set->poly_c][2] = set->tgt_pnt;
set->tgt_pnt++;
// 3 -> 0
subdivided_points[set->tgt_pnt][X] = (set->ptv[0][X] + set->ptv[3][X])>>1;
subdivided_points[set->tgt_pnt][Y] = (set->ptv[0][Y] + set->ptv[3][Y])>>1;
subdivided_points[set->tgt_pnt][Z] = (set->ptv[0][Z] + set->ptv[3][Z])>>1;
subdivided_polygons[set->poly_a][3] = set->tgt_pnt;
subdivided_polygons[set->poly_c][0] = set->tgt_pnt;
set->tgt_pnt++;
sub_vert_cnt = set->tgt_pnt;
sub_poly_cnt += 3; //Only add 3, as there was already 1 polygon. It was split into four.
}
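For context, the recursion is driven from outside this function; an illustrative outline follows (not the engine's actual driver, which also handles distance checks and buffer setup; setup_child and its CHILD_ constants are placeholders):
C:
//Outline only: each call turns one quad into four, then descends into
//each child until the requested depth runs out.
static void subdivide_recursive(_subdivision_settings * set, int depth)
{
    if(depth <= 0) return; //reached the target subdivision level
    subdivide_xy(set);     //5 new points; 1 quad rewritten into 4
    //Refill <set> with each child quad's corners and targets, then recurse:
    //setup_child(set, CHILD_A); subdivide_recursive(set, depth - 1);
    //...and likewise for children B, C, and D.
}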
The other methods explored were pre-computed subdivision lists and runtime hashtables:
This is the live hashtables variant:
C:
void undupe_sub_xy(_subdivision_settings * set)
{
//////////////////////////////////////////////////////////////////
// Subdivide by all rules / Subdivide polygon into four new quads
//Turn 4 points into 9 points
//Make the 4 new polygons
//////////////////////////////////////////////////////////////////
/*
0A 1A | 0B 1B
A B
3A 2A | 3B 2B
0D 1D | 0C 1C
C D
3D 2D | 3C 2C
*/
// Initial Conditions
subdivided_polygons[set->poly_a][0] = subdivided_polygons[set->poly_a][0];
subdivided_polygons[set->poly_b][1] = subdivided_polygons[set->poly_a][1];
subdivided_polygons[set->poly_d][2] = subdivided_polygons[set->poly_a][2];
subdivided_polygons[set->poly_c][3] = subdivided_polygons[set->poly_a][3];
// Center
//
set->pot[X] = (set->ptv[0][X] + set->ptv[1][X] +
set->ptv[2][X] + set->ptv[3][X])>>2;
set->pot[Y] = (set->ptv[0][Y] + set->ptv[1][Y] +
set->ptv[2][Y] + set->ptv[3][Y])>>2;
set->pot[Z] = (set->ptv[0][Z] + set->ptv[1][Z] +
set->ptv[2][Z] + set->ptv[3][Z])>>2;
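//Hash: sum the three fixed-point view-space coords, keep the high bits (>>18).
//Nothing re-checks the coordinates on a hit, so two different points that
//land in the same slot will be merged.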
int raw_hash = (set->pot[X] + set->pot[Y] + set->pot[Z])>>18;
unsigned short possible_dupe = hash_signed_offet[raw_hash];
if(possible_dupe == INVALID_PLANE) //empty slot: make the vertex; otherwise reuse the recorded one
{
subdivided_points[set->tgt_pnt][X] = set->pot[X];
subdivided_points[set->tgt_pnt][Y] = set->pot[Y];
subdivided_points[set->tgt_pnt][Z] = set->pot[Z];
subdivided_polygons[set->poly_a][2] = set->tgt_pnt;
subdivided_polygons[set->poly_b][3] = set->tgt_pnt;
subdivided_polygons[set->poly_d][0] = set->tgt_pnt;
subdivided_polygons[set->poly_c][1] = set->tgt_pnt;
hash_signed_offet[raw_hash] = set->tgt_pnt;
set->tgt_pnt++;
} else {
subdivided_polygons[set->poly_a][2] = possible_dupe;
subdivided_polygons[set->poly_b][3] = possible_dupe;
subdivided_polygons[set->poly_d][0] = possible_dupe;
subdivided_polygons[set->poly_c][1] = possible_dupe;
}
// 0 -> 1
set->pot[X] = (set->ptv[0][X] + set->ptv[1][X])>>1;
set->pot[Y] = (set->ptv[0][Y] + set->ptv[1][Y])>>1;
set->pot[Z] = (set->ptv[0][Z] + set->ptv[1][Z])>>1;
raw_hash = (set->pot[X] + set->pot[Y] + set->pot[Z])>>18;
possible_dupe = hash_signed_offet[raw_hash];
if(possible_dupe == INVALID_PLANE)
{
subdivided_points[set->tgt_pnt][X] = set->pot[X];
subdivided_points[set->tgt_pnt][Y] = set->pot[Y];
subdivided_points[set->tgt_pnt][Z] = set->pot[Z];
subdivided_polygons[set->poly_a][1] = set->tgt_pnt;
subdivided_polygons[set->poly_b][0] = set->tgt_pnt;
hash_signed_offet[raw_hash] = set->tgt_pnt;
set->tgt_pnt++;
} else {
subdivided_polygons[set->poly_a][1] = possible_dupe;
subdivided_polygons[set->poly_b][0] = possible_dupe;
}
// 1 -> 2
set->pot[X] = (set->ptv[2][X] + set->ptv[1][X])>>1;
set->pot[Y] = (set->ptv[2][Y] + set->ptv[1][Y])>>1;
set->pot[Z] = (set->ptv[2][Z] + set->ptv[1][Z])>>1;
raw_hash = (set->pot[X] + set->pot[Y] + set->pot[Z])>>18;
possible_dupe = hash_signed_offet[raw_hash];
if(possible_dupe == INVALID_PLANE)
{
subdivided_points[set->tgt_pnt][X] = set->pot[X];
subdivided_points[set->tgt_pnt][Y] = set->pot[Y];
subdivided_points[set->tgt_pnt][Z] = set->pot[Z];
subdivided_polygons[set->poly_b][2] = set->tgt_pnt;
subdivided_polygons[set->poly_d][1] = set->tgt_pnt;
hash_signed_offet[raw_hash] = set->tgt_pnt;
set->tgt_pnt++;
} else {
subdivided_polygons[set->poly_b][2] = possible_dupe;
subdivided_polygons[set->poly_d][1] = possible_dupe;
}
// 3 -> 2
set->pot[X] = (set->ptv[2][X] + set->ptv[3][X])>>1;
set->pot[Y] = (set->ptv[2][Y] + set->ptv[3][Y])>>1;
set->pot[Z] = (set->ptv[2][Z] + set->ptv[3][Z])>>1;
raw_hash = (set->pot[X] + set->pot[Y] + set->pot[Z])>>18;
possible_dupe = hash_signed_offet[raw_hash];
if(possible_dupe == INVALID_PLANE)
{
subdivided_points[set->tgt_pnt][X] = set->pot[X];
subdivided_points[set->tgt_pnt][Y] = set->pot[Y];
subdivided_points[set->tgt_pnt][Z] = set->pot[Z];
subdivided_polygons[set->poly_d][3] = set->tgt_pnt;
subdivided_polygons[set->poly_c][2] = set->tgt_pnt;
hash_signed_offet[raw_hash] = set->tgt_pnt;
set->tgt_pnt++;
} else {
subdivided_polygons[set->poly_d][3] = possible_dupe;
subdivided_polygons[set->poly_c][2] = possible_dupe;
}
// 3 -> 0
set->pot[X] = (set->ptv[0][X] + set->ptv[3][X])>>1;
set->pot[Y] = (set->ptv[0][Y] + set->ptv[3][Y])>>1;
set->pot[Z] = (set->ptv[0][Z] + set->ptv[3][Z])>>1;
raw_hash = (set->pot[X] + set->pot[Y] + set->pot[Z])>>18;
possible_dupe = hash_signed_offet[raw_hash];
if(possible_dupe == INVALID_PLANE)
{
subdivided_points[set->tgt_pnt][X] = set->pot[X];
subdivided_points[set->tgt_pnt][Y] = set->pot[Y];
subdivided_points[set->tgt_pnt][Z] = set->pot[Z];
subdivided_polygons[set->poly_a][3] = set->tgt_pnt;
subdivided_polygons[set->poly_c][0] = set->tgt_pnt;
hash_signed_offet[raw_hash] = set->tgt_pnt;
set->tgt_pnt++;
} else {
subdivided_polygons[set->poly_a][3] = possible_dupe;
subdivided_polygons[set->poly_c][0] = possible_dupe;
}
sub_vert_cnt = set->tgt_pnt;
sub_poly_cnt += 3; //Only add 3, as there was already 1 polygon. It was split into four.
}
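As you can see, that's a fair bit more complicated than the naive method.
One detail the function itself doesn't show: for those lookups to mean anything, every slot must start out as INVALID_PLANE, which implies a reset per plane. A minimal sketch, assuming a flat table (HASH_TABLE_SIZE is my placeholder for the real size):
C:
//Assumed setup, run once per plane before its tiles are subdivided.
//INVALID_PLANE marks "no vertex recorded in this slot yet".
static void reset_dupe_table(void)
{
    for(int i = 0; i < HASH_TABLE_SIZE; i++)
    {
        hash_signed_offet[i] = INVALID_PLANE;
    }
}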
And then I had the method using precomputed tables:
C:
void undupe_sub_xy(_subdivision_settings * set)
{
//////////////////////////////////////////////////////////////////
// Subdivide by all rules / Subdivide polygon into four new quads
//Turn 4 points into 9 points
//Make the 4 new polygons
//////////////////////////////////////////////////////////////////
/*
0A 1A | 0B 1B
A B
3A 2A | 3B 2B
0D 1D | 0C 1C
C D
3D 2D | 3C 2C
*/
// Initial Conditions
subdivided_polygons[set->poly_a][0] = subdivided_polygons[set->poly_a][0];
subdivided_polygons[set->poly_b][1] = subdivided_polygons[set->poly_a][1];
subdivided_polygons[set->poly_d][2] = subdivided_polygons[set->poly_a][2];
subdivided_polygons[set->poly_c][3] = subdivided_polygons[set->poly_a][3];
unsigned short checked_vert_tag = plyVertBuffer[set->tag_d].vertices[0];
//This is one case where it should never exist already.
if(!is_already_screenspace[checked_vert_tag])
{
// Center
//
subdivided_points[checked_vert_tag][X] = (set->ptv[0][X] + set->ptv[1][X] +
set->ptv[2][X] + set->ptv[3][X])>>2;
subdivided_points[checked_vert_tag][Y] = (set->ptv[0][Y] + set->ptv[1][Y] +
set->ptv[2][Y] + set->ptv[3][Y])>>2;
subdivided_points[checked_vert_tag][Z] = (set->ptv[0][Z] + set->ptv[1][Z] +
set->ptv[2][Z] + set->ptv[3][Z])>>2;
is_already_screenspace[checked_vert_tag] = 2;
live_vertex_tags[set->tgt_pnt] = checked_vert_tag;
set->tgt_pnt++;
}
subdivided_polygons[set->poly_a][2] = checked_vert_tag;
subdivided_polygons[set->poly_b][3] = checked_vert_tag;
subdivided_polygons[set->poly_d][0] = checked_vert_tag;
subdivided_polygons[set->poly_c][1] = checked_vert_tag;
// 0 -> 1
checked_vert_tag = plyVertBuffer[set->tag_b].vertices[0];
if(!is_already_screenspace[checked_vert_tag])
{
subdivided_points[checked_vert_tag][X] = (set->ptv[0][X] + set->ptv[1][X])>>1;
subdivided_points[checked_vert_tag][Y] = (set->ptv[0][Y] + set->ptv[1][Y])>>1;
subdivided_points[checked_vert_tag][Z] = (set->ptv[0][Z] + set->ptv[1][Z])>>1;
is_already_screenspace[checked_vert_tag] = 2;
live_vertex_tags[set->tgt_pnt] = checked_vert_tag;
set->tgt_pnt++;
}
subdivided_polygons[set->poly_a][1] = checked_vert_tag;
subdivided_polygons[set->poly_b][0] = checked_vert_tag;
// 1 -> 2
checked_vert_tag = plyVertBuffer[set->tag_b].vertices[1];
if(!is_already_screenspace[checked_vert_tag])
{
subdivided_points[checked_vert_tag][X] = (set->ptv[2][X] + set->ptv[1][X])>>1;
subdivided_points[checked_vert_tag][Y] = (set->ptv[2][Y] + set->ptv[1][Y])>>1;
subdivided_points[checked_vert_tag][Z] = (set->ptv[2][Z] + set->ptv[1][Z])>>1;
is_already_screenspace[checked_vert_tag] = 2;
live_vertex_tags[set->tgt_pnt] = checked_vert_tag;
set->tgt_pnt++;
}
subdivided_polygons[set->poly_b][2] = checked_vert_tag;
subdivided_polygons[set->poly_d][1] = checked_vert_tag;
// 3 -> 2
checked_vert_tag = plyVertBuffer[set->tag_c].vertices[2];
if(!is_already_screenspace[checked_vert_tag])
{
subdivided_points[checked_vert_tag][X] = (set->ptv[2][X] + set->ptv[3][X])>>1;
subdivided_points[checked_vert_tag][Y] = (set->ptv[2][Y] + set->ptv[3][Y])>>1;
subdivided_points[checked_vert_tag][Z] = (set->ptv[2][Z] + set->ptv[3][Z])>>1;
is_already_screenspace[checked_vert_tag] = 2;
live_vertex_tags[set->tgt_pnt] = checked_vert_tag;
set->tgt_pnt++;
}
subdivided_polygons[set->poly_d][3] = checked_vert_tag;
subdivided_polygons[set->poly_c][2] = checked_vert_tag;
// 3 -> 0
checked_vert_tag = plyVertBuffer[set->tag_a].vertices[3];
if(!is_already_screenspace[checked_vert_tag])
{
subdivided_points[checked_vert_tag][X] = (set->ptv[0][X] + set->ptv[3][X])>>1;
subdivided_points[checked_vert_tag][Y] = (set->ptv[0][Y] + set->ptv[3][Y])>>1;
subdivided_points[checked_vert_tag][Z] = (set->ptv[0][Z] + set->ptv[3][Z])>>1;
is_already_screenspace[checked_vert_tag] = 2;
live_vertex_tags[set->tgt_pnt] = checked_vert_tag;
set->tgt_pnt++;
}
subdivided_polygons[set->poly_a][3] = checked_vert_tag;
subdivided_polygons[set->poly_c][0] = checked_vert_tag;
sub_vert_cnt = set->tgt_pnt;
sub_poly_cnt += 3; //Only add 3, as there was already 1 polygon. It was split into four.
}
To weigh the complexity this adds against the work it removes: the thing being avoided is the screenspace transformation of extra vertices.
I ran out of space for text in the post so I'll just say... that's a very simple process, especially since these vertices are already in view-space.
Note that it checks whether a vertex is already in screenspace in case a tile was not subdivided at all (too far away).
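For scale, here is what each duplicate costs: roughly one perspective divide per vertex. A deliberately naive sketch (screen_points, SCR_SCALE, and the exact flag values are my assumptions, not the engine's):
C:
//Naive sketch of the screenspace pass that every duplicate vertex repeats.
//The vertices are already in view-space, so only the divide + scale remain.
//Assumes Z is nonzero here, i.e. near-plane rejection already happened.
static void transform_new_points(void)
{
    for(unsigned short i = 0; i < sub_vert_cnt; i++)
    {
        unsigned short tag = live_vertex_tags[i];
        if(is_already_screenspace[tag] == 1) continue; //already transformed
        screen_points[tag][X] = (subdivided_points[tag][X] * SCR_SCALE) / subdivided_points[tag][Z];
        screen_points[tag][Y] = (subdivided_points[tag][Y] * SCR_SCALE) / subdivided_points[tag][Z];
        is_already_screenspace[tag] = 1; //mark done so it is not divided twice
    }
}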
What situation would make me (or you) think that decreasing duplicate vertices is worth the effort?
In the Saturn demo program seen in ``demo.zip``:
First note: you have to press START on the blue screen once it has finished loading everything (after the negative number pops up).
Then you can open the debug menu (through the start menu) to show info text. That will let you see the engine's working vertex count.
What sent me down this spiral was seeing that the vertex counts were generally double the polygon counts; if the vertex duplicates were removed, the vertex count should only be higher by... well, let me explain THAT amount.
A grid of (x * y) vertices produces a polygon grid with ((x-1) * (y-1)) polygons in it. That is nowhere near double: for instance, a 21x21 grid is 441 vertices driving 20x20 = 400 polygons, only about 10% more vertices than polygons.
I resolved to find out whether it would be faster to remove duplicates insofar as I could remove them, which is on a per-plane basis. I couldn't remove duplicates between planes, as that would need buffers too large (you would have to store the vertices of every plane in the sector; not viable). Well, it simply turned out that the basic ways I could think of to do this failed.
What WOULD work / optimize this?
The last logical optimization I can think of is to precompute the subdivision pathway to every polygon, checking in at stop points (depths) along the way rather than subdividing down a simple recursive path. This seems like an impractical increase in code size & complexity, since a logic path would need to exist for every possible sequence of subdivisions in an entire plane, of which there are many.
Besides that, I may finally be at a point where I must pursue computational optimizations, e.g., assembly.