VTune

This may not be quite an OpenGL question, but I think someone here must have worked with this tool, and since I'm working on OpenGL right now I'd like to know whether the two can be used together. I installed VTune and tried to run the simplest wizard. The result was a few seconds of work, then the application started, a few more seconds passed, and it restarted automatically. Moreover, once, after I repeatedly exited the app as soon as it started, the program finished its work but then showed everything in assembly (I can understand a bit of that, but it's crazy!). Why is that? I'm working with VC++ 6.0 right now.

And by the way, what's the way to find the slowest link in my PC? I have a counter in my app that reads ~36 FPS, but that could mean my video card, CPU, memory, or AGP is slow, not all of them together, so I should pay attention to the problematic link in the chain, not the whole PC. Are there good performance testers that check all of those at once and show where the problem arises? It would also be great to see how much time the app spends in the vertex and pixel program stages.

Well, rather OT but good to know:
If VTune shows assembly only, it didn't find information about the source code in the analyzed app.
You need a *.map file with the symbols to get a per-function view, and a full *.pdb generated by the compiler should give you source access.
VTune also shows time per module, so you can see which part of your system uses the most clocks.
Be sure to measure with wait-for-vsync disabled; ~36 fps looks like a 72 Hz monitor refresh / 2.
You can use very small windows to find geometry and lighting (vertex program) bounds and very high resolutions to find fillrate (pixel program) bounds.

[This message has been edited by Relic (edited 11-27-2002).]

You can't collect performance data from your GPU in VTune. It is meant for profiling Intel processors. If you want access to all the performance counters you need at least a P4. Older or other CPUs (P3, P2, AMD) will not give you access to all counters.
To see the code of your app you have to compile a release build with debug information (always compile a release build when doing performance tests in VTune); then the drilldown will show you the source code of your app. VTune always does at least two runs of your application (how many depends on which counters you want to monitor): one run to calibrate the counters and one run where the actual data is collected.
Try to find some examples on the Intel site to get started with VTune. The tool is not that complex; the difficulty is learning to interpret the data it generates and knowing what to do about it in the code.

Happy optimizing :wink:

P3 has almost all the counters you want (partial evictions and CPU cycle counter being the ones you usually care the most about).

Here’s a quick run-down for getting useful data out of VTune:

  1. Make sure you profile the release mode version of your program :slight_smile:

  2. Build your binary with full program database debug information (this is not default for MSDEV)

  3. Start VTune. Choose the “Wizard” for starting a profiling project.

  4. In the wizard, make sure you’re using a sampling session for a Win32 app. Specify your binary (.EXE) file. Finish the wizard, but do not start the profile.

  5. Once the wizard is finished, open the Options dialog. Click Sampling Sessions. Say that you want to wait for 10 seconds before starting, and then sample for 20 seconds (tune as necessary).

  6. Click Advanced. Type in the time “0.1 milliseconds” and increase the memory range to at least 2048 kilobytes of sample memory.

  7. Open the Automation options page. Click “Do not start a program” (the last option).

  8. Close the options page. Your project is now set up.

  9. Start your program the normal way (either from the command line, from MSDEV, or by double-clicking it).

  10. Once the program is getting to the point you want to profile, switch back to VTune. Click the little “Play” button in the toolbar. VTune will make its window smaller, and begin a 10-second countdown.

  11. Use the 10-second countdown to switch back to your application and start doing what you want to be profiling.

  12. When VTune has counted down, it will profile the entire system in the background. When it's done profiling (20 seconds after the countdown ends) you should quit your application to give VTune all the CPU time.

  13. The VTune GUI and analyzer parts are written in Visual Basic.

  14. You will finally get a view of two windows: the Modules view (listing how much time was spent in each DLL/EXE file) and the Hot Spots view for your application.

  15. Make sure that the Hot Spots view for your application shows the right .pdb file name at the bottom of the window. If not, type in the right path, and press enter, and wait while VTune regenerates the data. This is KEY to get source view.

  16. Switch to source view, and start zooming/double-clicking to explore. Have fun!

At point 15, it's often useful to export the Function View and the Class View to comma-separated data files that can be opened in Excel, where you can sort and massage them to generate good data suitable for public presentation.

Thanks jwatte for your detailed explanation! I've been wanting to try VTune myself but have always been put off by the learning curve.

I was so surprised by the tuning advice: "Think about buying a faster processor", "Think about purchasing a multiprocessor system" and so on… and that's for an app that runs at 36 FPS. I don't think I've seen anything funnier in the last few months. What would the advice be for tuning Doom3 or Unreal? "Wait a few hundred years, so we can invent the Pentium 100 processor"