tri strips/fans hardware side: reset, degenerates, and anything else

it looks like i need to build a triangle stripper… for a couple years i’ve been using nvidia’s utility, but it has a tendency to produce swiss cheese meshes (which i would consider a bug). however at this point i also have tighter constraints… the stripper should be able to flip edges inside quads and maybe integrate fans.

so i have a few questions… if someone could just point me to documentation that would satisfy my curiosity, but discussion is welcome as well.

basically i’m curious about triangle strips on the hardware side. i’m curious about the rules surrounding degenerate triangles… and i get the impression that there is a way to embed a command to the processor to reset the stripping process inside the strip indices.

i’m also curious if there is a way to somehow get the processor to switch from strip mode to fan mode without resetting the cache or changing out indices… maybe by sending a -1 index or something.

i’m also curious about any information falling within this category… any hardware tool i can use basically.

michael

Perhaps you could modify the source of these strippers:
http://www.plunk.org/~grantham/public/meshifier/oldmesh.html

http://users.pandora.be/tfautre/softdev/tristripper/

AFAIK, one of the interesting features of nvidia’s tri stripper is the consideration of cache coherency w.r.t. the reuse of indexes in the mesh. Bearing that in mind you might want to check out the following links:

http://plunk.org/~grantham/public/actc/

http://www.cs.sunysb.edu/~stripe/

I’d strongly suggest avoiding the old stripper by Grantham and looking at his newer ACTC.

yeah, hardware caching is the only thing that really scares me away from this project… i really don’t want to have to know the hardware to the last detail, assuming i even find the energy to take things so far.

i will probably start with the nvidia stripper source as i figure it has a good bias towards the hardware.

anyone know anything about the possibility of changing primitive mode midstream? that would be a nice feature wouldn’t it?

the meshes i have to work with are relatively small and specialized. automated actually… no way to manage them by hand. their structure is also generally lousy for stripping, hence the desire to flip edges. i worry that the nvidia stripper might be having a hard time with them, though there aren’t any holes like i often get with large meshes. the meshes look more like fan meshes, so i worry that they may require a lot of short strips and/or degenerate triangles.

i would like to be able to find the penultimate optimal stripping for each mesh.

There are extensions for changing the primitive mode:

GL_NV_primitive_restart,
GL_SUN_triangle_list.

i have a few questions about the nvidia primitive restart extension. here is an excerpt from the specs:

*What should the default primitive restart index be?

RESOLVED: Zero. It’s tough to pick another number that is meaningful for all three element data types. In practice, apps are likely to set it to 0xFFFF or 0xFFFFFFFF.

i have had some issues with the nvidia triangle stripper utility. i was under the impression that it was by default using primitive restart in its output strip(s), passed in a single buffer.

this excerpt seems to either imply that there is no default restart index, or that the index is by default ‘zero’, or 0.

i wonder if i need to enable the restart, and if the stripper is embedding zeros then that might explain why my meshes sometimes come out with holes, though 95% of the time they are fine, and when they do have holes there are no other sort of anomalies like bogus triangles.

however i did try the latest library which actually produced worse meshes with some of the models that had produced meshes with holes in the 2001 library. it did have a bogus triangle, which might have been going back to vertex zero. however lately i’ve been using the newest version of the stripper extensively without any apparent issues.

so really i’m a bit confused about it all… i would look at the stripper source and try some experiments, but my development machine has been running the last couple of days building a database and i’d rather just get that over with first. takes about an hour to stop and pick back up anyhow.

so anyhow, if anyone uses the nvidia stripper along with primitive restart and would like to clarify some of this for me with a little bit of code, i would be very grateful.

and for anyone at nvidia, wouldn’t it be useful to have a primitive restart that actually changes the primitive mode? being able to change between strips and fans on the fly would be useful… would be more useful if the hardware could change modes without missing a step… like going from the end of a strip to a fan where the last two strip vertices are the first two of the fan and then back from the fan to the triangle.

sincerely,

michael

Primitive restart is disabled by default. The default primitive restart index is zero. You do not have to enable primitive restart, it should work fine with it disabled.

In my app I do this
glEnableClientState(GL_PRIMITIVE_RESTART_NV);
glPrimitiveRestartIndexNV(65535);

and then

SetCacheSize(24);
SetStitchStrips(1);
EnableRestart(65535);
GenerateStrips(pIndex, NumIndices, &pPrimitiveGroup, &NbGroups);
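
For anyone following along, here is a minimal sketch of what a restart index means for a stitched strip. It is plain CPU-side C++ that expands a strip into triangles, not driver or NvTriStrip code, and the names `expandStrip` and `Triangle` are made up for this illustration. With restart honoured, 65535 ends one strip and begins the next; with it ignored, 65535 is read as an ordinary vertex index.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Triangle = std::array<std::uint32_t, 3>;

// Expand a triangle strip into individual triangles (illustrative helper).
// If useRestart is true, an index equal to restartIndex ends the current
// strip and the next index starts a new one. If false, every index is
// treated as an ordinary vertex index.
std::vector<Triangle> expandStrip(const std::vector<std::uint32_t>& indices,
                                  bool useRestart,
                                  std::uint32_t restartIndex)
{
    std::vector<Triangle> tris;
    std::size_t start = 0;  // position where the current strip began
    for (std::size_t i = 0; i < indices.size(); ++i) {
        if (useRestart && indices[i] == restartIndex) {
            start = i + 1;  // the next index begins a fresh strip
            continue;
        }
        if (i < start + 2) continue;  // need three indices for a triangle
        const std::uint32_t a = indices[i - 2];
        const std::uint32_t b = indices[i - 1];
        const std::uint32_t c = indices[i];
        if (a == b || b == c || a == c) continue;  // degenerate, zero area
        // Every second triangle of a strip is wound the other way round,
        // so swap two of its indices to keep a consistent orientation.
        if ((i - start) % 2 == 0) tris.push_back(Triangle{{a, b, c}});
        else                      tris.push_back(Triangle{{b, a, c}});
    }
    return tris;
}
```

For the strip 0,1,2,3,65535,4,5,6 this gives three triangles when restart is honoured and six when it is not, three of which reference a vertex number 65535.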

so the GenerateStrips routine actually queries the restart states…

thanks a lot, good to have a solid answer. assuming you are 100% sure your methods are correct.

btw, anyone want to offer a rundown on the cache sizes for recent cards… with my own card right now i’ve been going with 24 which is recommended for geforce3 i think… but the card these days is a quadro fx series, though one of the low end more experimental models… still i’m curious if 24 is still the optimal cache size for modern cards.

michael
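
One way to compare candidate cache sizes without knowing the hardware in detail is to replay an index list through a simulated cache and count the misses. The sketch below assumes a plain FIFO cache, which is only a guess at how a given card behaves, and `countCacheMisses` is a name invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Count how many indices miss a simulated FIFO vertex cache.
// Assumption: the cache holds cacheSize entries and evicts the oldest
// entry on a miss; a hit does not change the eviction order.
std::size_t countCacheMisses(const std::vector<std::uint16_t>& indices,
                             std::size_t cacheSize)
{
    assert(cacheSize > 0);
    std::deque<std::uint16_t> cache;
    std::size_t misses = 0;
    for (std::uint16_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end())
            continue;  // hit: the vertex would not be transformed again
        ++misses;
        cache.push_back(idx);
        if (cache.size() > cacheSize)
            cache.pop_front();  // evict the oldest entry
    }
    return misses;
}
```

Running the same stripped index list with several values of `cacheSize` shows how sensitive a mesh is to the setting. Fewer misses per triangle is better, but the numbers only mean something if the simulated policy is close to the real one.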

No, I think you have misunderstood me.

If you want to use primitive restart you must call EnableRestart.

If you don’t want to use primitive restart you don’t have to call DisableRestart since the default state is disabled.

GenerateStrips does not query the restart states.

"Primitive restart is disabled by default. The default primitive restart index is zero. You do not have to enable primitive restart, it should work fine with it disabled. "

i don’t have it in front of me, but in a header with the nvidia c++ source i believe it says that primitive restart is used by default. i will have to take a look at the file, i believe this would mean that it would embed restart indices inside the strip array. so if it doesn’t query the state then how does it know what index to use for your restart? does it assume 65535? i seem to recall that GenerateStrips does not have overloads for non unsigned 16bit types. this really isn’t a problem as i’m sure it is best used as a preprocessing tool, but how then do you communicate your restart index to it, or does it just assume 65535 which you can change later if needed?

sorry for the trouble btw, i should probably just wait until my development machine is handy to continue any possible further discussion.

just for the record, i might look into those third party stripper efforts in a bit.

michael

Originally posted by michagl:
i don’t have it in front of me, but in a header with the nvidia c++ source i believe it says that primitive restart is used by default.

if it doesn’t query the state then how does it know what index to use for your restart? does it assume 65535?
From the header…

////////////////////////////////////////////////////////////////////////////////////////
// EnableRestart()
//
// For GPUs that support primitive restart, this sets a value as the restart index
//
// Restart is meaningless if strips are not being stitched together, so enabling restart
// makes NvTriStrip forcing stitching. So, you’ll get back one strip.
//
// Default value: disabled
//
void EnableRestart(const unsigned int restartVal);
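
To make the two ways of stitching concrete, here is a small sketch that joins separate strips into one index list, once with a restart index and once with degenerate triangles. It is not NvTriStrip's own code; `stitchWithRestart`, `stitchWithDegenerates` and `Strip` are hypothetical names, and the degenerate version uses the common trick of repeating the last index of one strip and the first index of the next.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Strip = std::vector<std::uint16_t>;

// Join strips by placing restartIndex between them.
// Assumption: no real vertex uses the value restartIndex.
Strip stitchWithRestart(const std::vector<Strip>& strips,
                        std::uint16_t restartIndex)
{
    Strip out;
    for (const Strip& s : strips) {
        if (s.empty()) continue;
        for (std::uint16_t idx : s) assert(idx != restartIndex);
        if (!out.empty()) out.push_back(restartIndex);
        out.insert(out.end(), s.begin(), s.end());
    }
    return out;
}

// Join strips with degenerate (zero area) triangles instead.
Strip stitchWithDegenerates(const std::vector<Strip>& strips)
{
    Strip out;
    for (const Strip& s : strips) {
        if (s.empty()) continue;
        if (!out.empty()) {
            const std::size_t n = out.size();
            out.push_back(out.back());  // repeat last index of previous strip
            out.push_back(s.front());   // repeat first index of next strip
            // The next strip must begin at an even position, otherwise its
            // winding flips. Add one more repeat when n is odd.
            if (n % 2 != 0) out.push_back(s.front());
        }
        out.insert(out.end(), s.begin(), s.end());
    }
    return out;
}
```

Joining 0,1,2,3 and 4,5,6 gives 0,1,2,3,65535,4,5,6 with restart and 0,1,2,3,3,4,4,5,6 with degenerates. The degenerate form works without any extension, at the price of a few extra indices per join.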

yeah, i feel like a total ass now, seeing that line in your original post:

EnableRestart(65535);

i don’t know how i missed it. not accustomed to reading your coding style maybe… i still swear that i recall reading, probably in that header, that restart was the default. maybe it was stitching is default, and i originally jumped to the conclusion that restart would facilitate the stitching, not understanding at the time exactly what restart is technically… or at least not having a clue that there was an API for it or anything of the sort… i assumed it was a new built in feature or something.

anyhow, i’m real sorry about not noticing that. in the future hopefully this incident will serve as a reminder to pay more attention.

sincerely,

michael

so i have NvTriStrip.h in front of me, and i’m just curious about a couple more things if you or anyone is game.

how do you get back strips that are not stitched together and don’t utilize restart?

also curious about the last function RemapIndices… it’s not one hundred percent clear, but as i imagine this function takes a strip and reformats it according to specifications. that should come in handy.

Originally posted by michagl:
so i have NvTriStrip.h in front of me, and i’m just curious about a couple more things if you or anyone is game.

how do you get back strips that are not stitched together and don’t utilize restart?

also curious about the last function RemapIndices… it’s not one hundred percent clear, but as i imagine this function takes a strip and reformats it according to specifications. that should come in handy.
so i’ve gotten around to looking at the RemapIndices function… and to answer my own question i believe that if strip stitching was disabled the ‘numGroups’ would be set to greater than 1.

i realize i must appear to be either stupid or at best short sighted at this point…

but just to confound matters. it was my thinking that RemapIndices would basically take an input strip and give back a new strip according to the present options.

but after deeper investigation this definitely does not appear to be the case. it appears that somehow the function returns a remapping of your vertex buffer rather than your indices… then it is at that point somehow your responsibility to remap your vertex buffer for ‘improved spatial locality’…

if someone would like to explain this notion of spatial locality to me i would appreciate it. i guess i would presume that it would be more reasonable to improve the spatial locality of the indices rather than the actual vertices… does hardware for some reason like vertices to be adjacent to one another or even in the same general region? i can understand that for non index based rendering a pointer might be more easily incremented along a vertex buffer, but i assumed that on hardware, assuming vertices are generally within the same segment, that there would be no optimization benefit to spatial locality within the vertex buffer. is there some kind of block caching that goes on with vertices on hardware?

any info is welcomed. i’m actually considering starting a new hardware performance thread due to drastic performance anomalies i’ve observed lately. not sure if it would just be better to address in this thread or not.

sincerely,

michael

I think remap indices is there to allow you to find out which vertex from the original set you passed into the tristripper corresponds to which vertex in the output. NVTriStrip does all the reorganizing of the vertices itself; however if you have extra per-vertex information other than position, texcoords, and normals, then you would need this to carry this extra information over from your original array to the new array (since the tristripper does not move the extra information when it is reorganizing the vertices).

I think that RemapIndices is an optional additional optimisation. It is quite separate from the GenerateStrips function.

The way I think it works is you pass in your indices and it reorders them to be more optimal. You must then manually reorder your vertex data to correspond to the new index list.

So say your index list was 1,3,2 and RemapIndices returns them as 1,2,3, you would have to swap your second and third vertex in the vertex buffer.
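
A minimal sketch of that idea, assuming the remap simply renumbers vertices in the order the index list first touches them. This is an illustration of the reordering described above, not a copy of what RemapIndices does internally, and `remapByFirstUse` and `reorderVertices` are invented names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Renumber vertices in order of first use and rewrite indices in place.
// Returns oldToNew, where oldToNew[oldIndex] is the vertex's new position.
// Assumptions: every index is below numVerts (so no restart indices are
// embedded) and numVerts is below 0xFFFF, which is used as a marker here.
std::vector<std::uint16_t> remapByFirstUse(std::vector<std::uint16_t>& indices,
                                           std::size_t numVerts)
{
    const std::uint16_t unused = 0xFFFF;
    assert(numVerts < unused);
    std::vector<std::uint16_t> oldToNew(numVerts, unused);
    std::uint16_t next = 0;
    for (std::uint16_t& idx : indices) {
        assert(idx < numVerts);
        if (oldToNew[idx] == unused) oldToNew[idx] = next++;
        idx = oldToNew[idx];
    }
    // Vertices that no index refers to are parked at the end.
    for (std::uint16_t& m : oldToNew)
        if (m == unused) m = next++;
    return oldToNew;
}

// Move each vertex to its new slot so the rewritten indices still point
// at the same data. Vertex can be any copyable per-vertex struct.
template <typename Vertex>
std::vector<Vertex> reorderVertices(const std::vector<Vertex>& verts,
                                    const std::vector<std::uint16_t>& oldToNew)
{
    assert(verts.size() == oldToNew.size());
    std::vector<Vertex> out(verts.size());
    for (std::size_t i = 0; i < verts.size(); ++i)
        out[oldToNew[i]] = verts[i];
    return out;
}
```

With zero-based indices 0,2,1,2,1,3 the list becomes 0,1,2,1,2,3 and the second and third vertices trade places, so walking the index list reads the vertex buffer roughly front to back.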

Originally posted by Adrian:
I think that RemapIndices is an optional additional optimisation. It is quite separate from the GenerateStrips function.

The way I think it works is you pass in your indices and it reorders them to be more optimal. You must then manually reorder your vertex data to correspond to the new index list.

So say your index list was 1,3,2 and RemapIndices returns them as 1,2,3, you would have to swap your second and third vertex in the vertex buffer.
that’s how i read it as well i think… but i just can’t see the usefulness of such an optimization… but then i don’t pretend to understand the hardware or drivers intimately as well.

i plan to discuss this stuff and some other stuff further… but i have about 200 thousand small meshes which need to be restripped, and i’ve fouled it up 2 times but expect the 3rd to be the charm.

the process takes about 24 hours… the first time i woke up bleary eyed, broke out of the program to check the progress… then looking at the code in the debugger, i saw what at first glance appeared to be a very serious memory bug in the code… but it wasn’t on closer examination… but i had exited by that time for some reason… moral of the story, don’t fool with code straight out of bed in the morning.

second time, i had stuck in a last minute change in the disk writing code, where i copied and edited a similar line which included an &ddress operator i forgot to remove… so that cost me that run.

this run ought to go over well though… should be finished by tomorrow morning.

then maybe i will discuss a new and very promising arbitrary geometry ROAM approach which has matured very well as of late and i’m very excited about. i believe it may approach theoretical limits very closely, and just might be the dawning of a new ‘era’ of massive infinitely scalable geometric data processing. i will be mostly interested in discussing very fine level hardware concerns.

Originally posted by michagl:
that’s how i read it as well i think… but i just can’t see the usefulness of such an optimization…
[…]
the process takes about 24 hours…
[…]

The remapping of the indices is to render the access of vertex data more sequential; thus improving memory access.

One of the reasons why I wrote Tri Stripper was because NvTriStrip is so slooooowww. :)
AFAIK NvTriStrip has some algorithms that have a complexity of O(n^2).

Originally posted by GPSnoopy:
The remapping of the indices is to render the access of vertex data more sequential; thus improving memory access.

One of the reasons why I wrote Tri Stripper was because NvTriStrip is so slooooowww. :)
AFAIK NvTriStrip has some algorithms that have a complexity of O(n^2).
i would be grateful if you could try to explain that first line in detail, or maybe point me to technical docs.

as for your ‘Tri Stripper’, i’m not familiar with it. i figure the nvidia tristripper is so slow because it starts with no connectivity information. i’ve thought about hacking to allow that info to be uploaded, but for now i’d rather just crack my whip at one of my poor machines. i feel particularly bad about it because i’ve made the poor thing do it three times now for no good reason.

if your TriStripper produces the same or better hardware optimizations than nvidias, then i would be happy to borrow it. i will eventually have to build a little standard io utility to do this once i get around to hosting it publicly. i plan to hack the triangle stripper to speed it up as much as possible for my particular situation, but not sacrifice any hardware optimizations.

i’m pretty happy with the nvidia stripper… but after i get around to speeding it up some, i might consider trying to hack it so that it can flip quads to get better strips.

edit: oh i see that your tristripper is one of the recommended strippers… the one with the big gaudy drop shadowed red lettered framed internet presence if i recall correctly.

do you care to discuss it? its cache coherency capabilities… can it do fans? if so can it determine whether a mesh would be better stripped as fans or strips from the perspective of the cache?