Quadro bug

I have a program that streams 3D data over the network and then renders it. I have an inexplicable problem where, only on Quadro cards, the network connection drops as soon as I render a few frames of data. I’ve tried everything imaginable to reproduce this bug on my GTX card: pausing the network traffic, simulating high CPU load, changing the way I receive the data.

On a Quadro card, it consistently breaks. If I don’t call RenderData(), the network stream works flawlessly.

It doesn’t sound driver related. However, when I googled the error message "WSAECONNABORTED and recv", the 3rd result on Google was, amazingly, someone with an identical issue.
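When chasing an error like this, it can help to log the recv return value together with the numeric Winsock code, since a graceful close, a local abort and a peer reset all look like "the connection dropped" from the application's side. Below is a minimal sketch of such a logging helper. The function name and the `kWsa*` constants are hypothetical; the numeric values are the standard Winsock codes, copied in so the snippet compiles without `<winsock2.h>`.

```cpp
#include <string>

// Numeric values of a few Winsock error codes, duplicated here so this
// sketch builds on any platform. Real code would use the WSAE* macros.
constexpr int kWsaEIntr        = 10004;
constexpr int kWsaEWouldBlock  = 10035;
constexpr int kWsaEConnAborted = 10053;
constexpr int kWsaEConnReset   = 10054;
constexpr int kWsaETimedOut    = 10060;

// Hypothetical helper. recvResult is the value returned by recv();
// wsaError is WSAGetLastError(), read immediately after a failed call.
inline std::string describeRecvOutcome(int recvResult, int wsaError) {
    if (recvResult > 0) {
        return "received " + std::to_string(recvResult) + " bytes";
    }
    if (recvResult == 0) {
        return "peer closed the connection gracefully";
    }
    switch (wsaError) {
        case kWsaEConnAborted:
            return "WSAECONNABORTED: connection aborted by software on this host";
        case kWsaEConnReset:
            return "WSAECONNRESET: connection reset by the peer";
        case kWsaETimedOut:
            return "WSAETIMEDOUT: connection timed out";
        case kWsaEIntr:
            return "WSAEINTR: blocking call was interrupted";
        case kWsaEWouldBlock:
            return "WSAEWOULDBLOCK: no data available on a non-blocking socket";
        default:
            return "Winsock error " + std::to_string(wsaError);
    }
}
```

WSAECONNABORTED in particular points at the local machine rather than the sender, which fits the observation that the failure depends on which graphics card is installed.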

That’s pretty strange. Sure sounds like some strange interaction between the NVidia Quadro driver and the network driver. Possibly the Quadro driver is disabling interrupts for too long, using too much buffer space, or otherwise somehow starving out the network driver in some way.

Overall, a few things you might try to investigate this one involve: 1) monitoring your app and the driver, 2) modifying your app, and 3) modifying the NVidia driver config.

  1. I’d use Resource Monitor, Process Explorer, or whatever you can get your hands on to keep an eye on both 1) CPU/memory consumption for your application as well as 2) CPU/memory consumption for the NVidia driver and your network driver (or possibly the kernel in general). Hopefully that’ll provide some clues. Do you see high CPU? Do you see memory growth?

  2. I would try modifying your GL app to see if you can isolate what your app is doing that instigates this problem (if anything). For instance, are you free-running? If so, enable VSync (Sync-to-VBlank) (and time your frame loop to verify that you’re getting it). Try pumping up the SwapInterval to > 1. Try disabling most of your draw submission code and just do a Clear and Swap of your window. Gradually re-enable certain sub-portions of your frame draw code. Possibly slow things down by putting a glFinish() after Swap, and possibly after certain strategic points during your frame. Try reducing window resolution and/or hiding the window altogether.

  3. The other thing I would try is modifying your NVidia driver config (or the environment in which it operates) to see if you can make the network problem go away. For instance, to reduce memory consumption/bandwidth, try reducing the size of your virtual screen / desktop. If multiple monitors, cut down to one. If one, reduce the resolution. Try forcing the driver to run single-threaded (if that’s an option on Windows). Modify how your system handles IRQ routing and APIC (see below). I realize you’re on Windows, but here are a few sections of the NVidia Linux driver README.txt that caught my eye when I scanned it w.r.t. your problem.
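For the "time your frame loop" check in step 2, here is a minimal sketch of one way to do it. It records the interval between swaps and compares the median against the expected refresh period. `FrameTimer`, `medianMs` and `looksVSynced` are hypothetical names, the 10% tolerance is an arbitrary assumption, and the refresh rate has to be supplied by the caller.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: records the time between successive buffer swaps.
class FrameTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Call once per frame, right after the swap (or glFinish) returns.
    void tick(Clock::time_point now = Clock::now()) {
        if (hasLast_) {
            intervalsMs_.push_back(
                std::chrono::duration<double, std::milli>(now - last_).count());
        }
        last_ = now;
        hasLast_ = true;
    }

    const std::vector<double>& intervalsMs() const { return intervalsMs_; }

private:
    Clock::time_point last_{};
    bool hasLast_ = false;
    std::vector<double> intervalsMs_;
};

// The median is less sensitive than the mean to an occasional long frame.
inline double medianMs(std::vector<double> v) {
    if (v.empty()) return 0.0;
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// True if the median frame interval is within `tolerance` (a fraction) of
// swapInterval refresh periods, i.e. the loop appears to be paced by VBlank.
inline bool looksVSynced(const std::vector<double>& intervalsMs,
                         double refreshHz, int swapInterval = 1,
                         double tolerance = 0.10) {
    if (intervalsMs.empty() || refreshHz <= 0.0 || swapInterval < 1) {
        return false;
    }
    const double expectedMs = 1000.0 * swapInterval / refreshHz;
    return std::fabs(medianMs(intervalsMs) - expectedMs) <= tolerance * expectedMs;
}
```

In the render loop you would call `tick()` after each swap and check `looksVSynced(timer.intervalsMs(), 60.0)` after a few hundred frames. A median of a millisecond or two on a 60 Hz display means the loop is still free-running.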

Hi Dark Photon, thanks for the reply.

[QUOTE=Dark Photon;1283540]That’s pretty strange. Sure sounds like some strange interaction between the NVidia Quadro driver and the network driver. Possibly the Quadro driver is disabling interrupts for too long, using too much buffer space, or otherwise somehow starving out the network driver in some way.
[/QUOTE]

That was my guess. Perhaps it was somehow flushing the TCP buffer. Normally receiving TCP is not time dependent. UDP will just throw packets away if the buffer is full or you don’t receive them fast enough.

I found a workaround: setting the driver profile to dynamic streaming seems to fix it. I don’t know if it actually fixes it, or just minimises the problem so I don’t notice it. It also doesn’t happen if I have a debugger attached, so I am guessing it’s some sort of timing issue. I am not even sure what could possibly break the TCP connection.

The documentation for recv says this:

Winsock may need to wait for a network event before the call can complete. Winsock performs an alertable wait in this situation, which can be interrupted by an asynchronous procedure call (APC) scheduled on the same thread.

Wonder if the driver could be doing that.
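One way to test the APC theory, assuming recv currently runs on the same thread that makes the GL calls, would be to move the blocking read onto its own thread that never touches GL. An APC is queued to a specific thread, so anything the driver queues to the render thread could then not interrupt the wait inside recv. The sketch below shows only the threading pattern. `StreamReceiver` is a hypothetical name, and `recvFn` stands in for a small wrapper around the real `recv(sock, buf, len, 0)` call.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical receiver: runs a blocking read on a dedicated thread so the
// render thread never sits inside recv().
class StreamReceiver {
public:
    // Stand-in for a recv() wrapper. Like recv, it returns the number of
    // bytes read, 0 on orderly close, or a negative value on error.
    using RecvFn = std::function<int(char* buf, int len)>;

    explicit StreamReceiver(RecvFn recvFn) : recvFn_(std::move(recvFn)) {
        worker_ = std::thread([this] { run(); });
    }

    // Real code would close the socket first so a blocked recv returns.
    ~StreamReceiver() { join(); }

    void join() {
        if (worker_.joinable()) worker_.join();
    }

    // Called from the render thread once per frame. Takes whatever has
    // arrived so far and never waits on the network.
    std::vector<std::vector<char>> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::vector<char>> out;
        out.swap(chunks_);
        return out;
    }

    // True once recvFn has returned <= 0; lastResult() holds that value.
    bool finished() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return finished_;
    }
    int lastResult() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return lastResult_;
    }

private:
    void run() {
        char buf[4096];
        for (;;) {
            const int n = recvFn_(buf, static_cast<int>(sizeof buf));
            std::lock_guard<std::mutex> lock(mutex_);
            if (n <= 0) {
                lastResult_ = n;
                finished_ = true;
                return;
            }
            chunks_.emplace_back(buf, buf + n);
        }
    }

    RecvFn recvFn_;
    mutable std::mutex mutex_;
    std::vector<std::vector<char>> chunks_;
    bool finished_ = false;
    int lastResult_ = 0;
    std::thread worker_;
};
```

If the connection still aborts with the receive on its own thread, the APC explanation becomes less likely and the timing or starvation theories look more plausible.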

Interesting. Good find! Websearching that option (out of curiosity) reveals quite a few posts recommending that folks flip their Quadros to that mode (or Visual Simulation) to solve performance problems and crash issues.

One of those hits has a familiar description:
