using shared memory on the Streaming multiprocessor



driver
10-20-2014, 12:00 PM
Hi

I have implemented my code so that it stores certain lookup values in shared memory, which means the rest of the threads do not need to fetch the same values from global memory.
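
To make the pattern concrete, here is a minimal sketch of what is described above. It is written in CUDA terms rather than as the actual compute shader, and the kernel name, table size, block size, and buffer names are all hypothetical, chosen only for illustration:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TABLE_SIZE 256  // hypothetical lookup-table size
#define BLOCK_SIZE 256  // hypothetical threads per block

// Each block first copies the lookup table from global memory into
// shared memory, then every thread reads its value from the shared copy.
__global__ void applyLookup(const float* table, const unsigned char* indices,
                            float* out, int n)
{
    __shared__ float sTable[TABLE_SIZE];

    // Cooperative load: the threads of the block stride over the table.
    for (int i = threadIdx.x; i < TABLE_SIZE; i += blockDim.x)
        sTable[i] = table[i];

    // Every load must complete before any thread reads sTable.
    __syncthreads();

    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)
        out[gid] = sTable[indices[gid]];
}

int main()
{
    const int n = 1 << 16;
    float* table;
    unsigned char* indices;
    float* out;

    // Managed memory keeps the host-side setup short for this sketch.
    cudaMallocManaged(&table, TABLE_SIZE * sizeof(float));
    cudaMallocManaged(&indices, n * sizeof(unsigned char));
    cudaMallocManaged(&out, n * sizeof(float));

    for (int i = 0; i < TABLE_SIZE; ++i) table[i] = 2.0f * i;
    for (int i = 0; i < n; ++i) indices[i] = (unsigned char)(i % TABLE_SIZE);

    int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    applyLookup<<<blocks, BLOCK_SIZE>>>(table, indices, out, n);
    cudaDeviceSynchronize();

    // Compare the kernel output against a direct lookup on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (out[i] != table[indices[i]]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(table);
    cudaFree(indices);
    cudaFree(out);
    return mismatches != 0;
}
```

In a GLSL compute shader the rough equivalents would be a `shared` array in place of `__shared__` and `barrier()` in place of `__syncthreads()`.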

However, I find that although the first dispatch of the compute shader is faster with shared memory than without it, subsequent dispatches are not. Is there a reason for this? Does shared memory need to be *freed* or *cleaned up* after each compute shader dispatch?

Thanks