Problems with Cg

For every local parameter setting, Cg makes several thousand unnecessary GetProgram and BindProgram calls. The same problem exists in every Cg version so far.

I found that this has been brought up on other forums (gamedev, etc.) too, but no solution was found.
I even asked NVIDIA, but haven’t received a reply yet.

So, has anyone else hit the same problem and/or found a solution for this?

Thanks!

I have the exact same issue (I was the one who brought it up over at gamedev). Eight THOUSAND API calls per frame! Most of them completely pointless!

My guess is that they built the libs with the debugging checks turned on (to verify at run time that everything is what it should be) and released them that way. However, I would think they’d make sure that wasn’t the case, since people choosing between Cg and something else would be hard pressed to pick Cg if it made their application slow to a crawl.

I too emailed NVIDIA and never got a response. Maybe someone who works there will read this…

Hi holdeWaldfee and PfhorSlayer,

I’m sorry you haven’t gotten a response to your questions about Cg. If you’ll send them directly to me (cass@nvidia.com), I’ll make sure they get answered.

We are working on a number of issues related to the frequency of GL API calls in the Cg runtime. Specifically, we have the following issues:

Problem: checking for GL errors
Solution: provide a Cg runtime call to disable this behavior (still useful for debug, but not release)

Problem: glGet/glBind/glUniform/glBind(oldprog)
Solution 1: provide Cg runtime caching of parameter updates, defer them until program bind
Solution 2: GL provides glUniformDirect( program, … ) and glBufferSubDataDirect( buffer_object, … ) or similar calls so that updates can happen immediately.
Note: We’re working on both fronts, but solution 2 is the most desirable, as it fixes a problem with the OpenGL API.

Problem: CgFX API is inefficient for state setting.
Solution: This is a generic OpenGL middleware problem that the ARB is working on now. I personally think we need light-weight, display-list-like objects for communicating large chunks of state changes efficiently. But it’s unclear exactly how the ARB will decide to address this problem.
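
Roughly, the Solution 1 caching idea could be sketched like this on the application side (all names here are illustrative, not real Cg runtime internals; the actual GL upload entry point, e.g. glProgramLocalParameter4fvARB, is abstracted behind a function pointer):

```c
/* Sketch of deferred parameter updates: writes only touch a CPU-side
 * shadow copy, and dirty registers are flushed in one pass at program
 * bind time, so per-set GetProgram/BindProgram traffic disappears. */
#include <string.h>

#define MAX_PARAMS 256

typedef struct {
    float value[MAX_PARAMS][4];       /* shadow copy of each float4 register */
    unsigned char dirty[MAX_PARAMS];  /* 1 if register changed since flush */
    int num_dirty;
} ParamCache;

/* Record a parameter write without touching the GL at all. */
static void cacheSetParameter4fv(ParamCache *c, int index, const float *v)
{
    memcpy(c->value[index], v, 4 * sizeof(float));
    if (!c->dirty[index]) {
        c->dirty[index] = 1;
        c->num_dirty++;
    }
}

/* Flush all dirty registers at bind time. `upload` stands in for the real
 * GL call (e.g. glProgramLocalParameter4fvARB); returns how many uploads
 * were actually issued. */
static int cacheFlush(ParamCache *c,
                      void (*upload)(int index, const float *v))
{
    int i, issued = 0;
    for (i = 0; i < MAX_PARAMS; i++) {
        if (c->dirty[i]) {
            upload(i, c->value[i]);
            c->dirty[i] = 0;
            issued++;
        }
    }
    c->num_dirty = 0;
    return issued;
}
```

Setting the same register several times between binds then costs one GL call instead of several, which is where the big savings come from.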

I hope this answers some of your questions. If you have other specific ones, feel free to reply on this thread or email me directly.

You should see some of these issues addressed in the next release of Cg 1.5 (which should be non-beta).

Thanks -
Cass

Hi Cass, I have another problem: the function cgCreateProgram sometimes returns the error “Invalid parameter handle”. What does this error mean, and how can I avoid it? I’ve encountered the problem on several ATI cards.

The hardest problem for us was that every cgGLSetParameter() call was too slow.
When we traced it, we saw lots of unnecessary program rebindings, as Cass mentioned above.
When we implemented some caching (update uniforms only when they change, and do it on the real program bind), we got an incredible boost: CPU workload decreased greatly, freeing CPU power for other jobs.
But the 1st and 3rd problems are still present.

did you use arb or glsl as profile ?

jackis, what do you mean by “real program bind”? How do you prevent Cg from rebinding?

Mmm, CrazyButcher, who was your question addressed to?

We used the ARB and NV native profiles. No glslf, as Cg 1.3 didn’t support it.

I was referring to Johnson.

And did you mean by “real bind” that you extracted the “arb_program_parameter” stuff yourself and used those functions directly?

CrazyButcher

Yes, at first we just cached the calls to cgGLSetParameterXfv(), which gave a good boost. But we decided to go further and free our render loop from any Cg API calls, which saved us a bit more CPU overhead )) So at shader load time we extract each parameter’s ARB register and set it directly (to avoid the program switching that Cass mentioned their API does).
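
A speculative sketch of that load-time extraction (the resolver and upload hooks stand in for cgGetParameterResourceIndex and glProgramLocalParameter4fvARB respectively; the struct and function names are invented for illustration, not part of any API):

```c
/* Sketch: resolve each Cg parameter to its ARB program register index
 * once at shader load, then write registers directly at render time,
 * bypassing the Cg runtime entirely. */
#define MAX_TRACKED 64

typedef struct {
    void *cg_param;  /* opaque CGparameter handle */
    unsigned reg;    /* ARB local-parameter register index */
} ParamSlot;

typedef struct {
    ParamSlot slots[MAX_TRACKED];
    int count;
} ParamTable;

/* Load time: resolve the handle once via `resolve` (stand-in for
 * cgGetParameterResourceIndex) and remember the register. Returns the
 * slot index, or -1 if the table is full. */
static int tableAdd(ParamTable *t, void *cg_param,
                    unsigned (*resolve)(void *param))
{
    if (t->count >= MAX_TRACKED)
        return -1;
    t->slots[t->count].cg_param = cg_param;
    t->slots[t->count].reg = resolve(cg_param);
    return t->count++;
}

/* Render time: write straight to the register via `upload` (stand-in
 * for glProgramLocalParameter4fvARB) -- no Cg calls, no rebinds. */
static void tableSet4fv(const ParamTable *t, int slot, const float *v,
                        void (*upload)(unsigned reg, const float *v))
{
    upload(t->slots[slot].reg, v);
}
```

The point is that all Cg runtime traffic happens once per shader at load time; the per-frame path touches only the GL.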

Originally posted by Jackis:
[b] CrazyButcher

Yes, firstly we’ve just cached calls to cgGLSetParameterXfv(), that gave good boost. But we decided to go further, and to free our program loop from any CG API calls, that gave us a little bit more CPU overhead )) So on shader loading we extract it’s ARB parameter and bind it in a straight way (for not to switch programs, as Cass mentioned their API does). [/b]
Sorry to bring up this dead thread, but are you saying that you basically just gathered all the data you needed at shader load time (i.e., the OpenGL shader objects, which units textures should be bound to, etc.) and then used that instead of the Cg API to set things up for rendering?

Is there anything special I have to do to get a shader (defined in a cgfx file) that works fine with arbvp1/arbfp1 to work with glslv/glslf? I am using Cg 1.5 (non-beta).

Hi,

<rant>

I guess I should have done my homework better because now I’ve run head-first into the problems mentioned in the beginning of this thread.

My problem is that calling

cgGLSetParameterArray4f(parameter, 0, count, data)

with a vp40 profile doesn’t call

glProgramParameters4fvNV(GL_VERTEX_PROGRAM_NV, location, count, data)

but instead does something like this internally (2.0 beta):

for (int i = 0; i < count; ++i)
{
     double temp[4] = { data[i * 4],     data[i * 4 + 1],
                        data[i * 4 + 2], data[i * 4 + 3] };
     glProgramParameter4dvNV(GL_VERTEX_PROGRAM_NV, location + i, temp);
}

Why hasn’t this insanity been fixed? They’re at version 2.0 (beta) now, and it isn’t exactly as though nobody has pointed out the problem.

I have two versions of a piece of code that renders a lot of skinned objects, one Cg and one GLSL, and the plain GLSL version is nearly 1000% faster (yes, you read that correctly)!

Doesn’t NVIDIA want people to use their API for anything other than toy projects?

</rant>

/A.B.