I'm developing a scientific application that has to estimate, as accurately as possible, the time difference between an object being drawn into the video back buffer and the moment that object actually becomes visible on the screen. In other words, I need to understand how DirectX on Windows XP+ deals with the monitor's vertical refresh cycle.

I'll start by saying that my video routines are based on the SDL 1.3 library. As a result, I don't have immediate access to the DirectX API, but this could be changed if necessary. DirectX is initialized with D3DSWAPEFFECT_DISCARD, D3DPRESENT_INTERVAL_ONE, and BackBufferCount = 1 in full-screen mode. Those seem to be the most critical parameters, but I'm happy to dig through the rest of the SDL code if more information is needed.
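
For reference, here is roughly what that initialization looks like at the Direct3D 9 level. SDL does the equivalent internally; the resolution, window handle, and vertex-processing flag below are just placeholder assumptions, not SDL's actual code:

    #include <d3d9.h>

    // Sketch of the presentation parameters described above.
    IDirect3DDevice9* CreateVsyncedFullscreenDevice(IDirect3D9* d3d, HWND hWnd)
    {
        D3DPRESENT_PARAMETERS pp = {};
        pp.Windowed             = FALSE;                   // exclusive full-screen mode
        pp.hDeviceWindow        = hWnd;
        pp.SwapEffect           = D3DSWAPEFFECT_DISCARD;   // back buffer contents are discarded after Present
        pp.BackBufferCount      = 1;                       // one back buffer plus the front buffer
        pp.BackBufferWidth      = 1920;                    // placeholder display mode
        pp.BackBufferHeight     = 1080;
        pp.BackBufferFormat     = D3DFMT_X8R8G8B8;
        pp.PresentationInterval = D3DPRESENT_INTERVAL_ONE; // at most one swap per refresh (vsync)

        IDirect3DDevice9* device = NULL;
        if (FAILED(d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
                                     D3DCREATE_HARDWARE_VERTEXPROCESSING,
                                     &pp, &device)))
            return NULL;
        return device;
    }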

The D3DPRESENT_INTERVAL_ONE flag ensures that back and front buffers are swapped no more than once per refresh cycle, and never in the middle of a refresh (it basically enables vsync). Indeed, if I have a simple loop that just continually calls IDirect3DDevice9::Present (SDL_RenderPresent, in my case), this function will block for the number of milliseconds between two refresh cycles (16.67ms with 60Hz, 10ms with 100Hz, etc.).
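
A quick way to see this is to time the call itself. Here is a rough sketch (the renderer-argument form of SDL_RenderPresent is an assumption on my part; the exact signature depends on the SDL 1.3 snapshot in use):

    #include <SDL.h>
    #include <cstdio>

    // With D3DPRESENT_INTERVAL_ONE in effect, every call after the first few
    // should block for roughly one refresh period (~16-17 ms at 60 Hz).
    void TimePresents(SDL_Renderer* renderer)
    {
        for (int i = 0; i < 10; ++i) {
            Uint32 before = SDL_GetTicks();
            SDL_RenderPresent(renderer);
            Uint32 after = SDL_GetTicks();
            printf("Present %d blocked for ~%u ms\n", i, after - before);
        }
    }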

Here's my question... Suppose I draw a white square in the back buffer and call SDL_RenderPresent, which blocks for 16.67 ms (assuming 60Hz refresh). What can I conclude about the state of the visible image on the monitor when the call to SDL_RenderPresent returns? Here are the possibilities, as I see it:

  1. The white square was just drawn on the monitor.
  2. The white square is about to be drawn (in less than 1 ms).
  3. The previous front buffer was just drawn; it will take another refresh cycle (16.67 ms) before my white square appears (calling SDL_RenderPresent again will get me to case 1).
  4. The previous front buffer was drawn in the last 16.67 ms, my white square is next, but the exact time till the next refresh is unknown.

From all the reading that I've done, I'm leaning toward option 3, but I can't find any guarantees against 4. In my configuration, the Present function should block only if it is being called for the second time during a pause between two refresh cycles. Since the goal is to swap the front and back buffers, the earliest point at which the second call can do this is just after the monitor was refreshed (previous front buffer was just drawn). It is at that point that the back buffer containing my white square can be moved to the front, but it must wait for (at most) 16.67 ms before the monitor will actually read and display the buffer contents. Ideally, I'd like to hear that the function should always return as soon as the previous refresh cycle is finished.

Can anyone more experienced with DirectX provide any insight on this topic? Are my assumptions correct, or am I missing something? Will these assumptions hold for any system that has DirectX support, or could the behavior change depending on the video card, the monitor, or some other factor?

As a final minor question, going back to my loop that just calls SDL_RenderPresent over and over again, I noticed that the first 3 or 4 calls return immediately, while all subsequent ones wait for the refresh cycle. Am I correct in assuming that the D3DPRESENT_INTERVAL_ONE restriction is simply being ignored prior to the first refresh (as opposed to some sort of queuing taking place across more buffers than the 2 I'm expecting to have)?

In other words, suppose the loop is entered with ~8 ms to go until the next refresh cycle. It might be able to swap the front and back buffers 4 times during this period. Until that first refresh happens, SDL_RenderPresent will return immediately (since we technically don't have a front buffer yet, only 2 back buffers), but the blocking will start to take place as soon as one of those buffers is shown on the screen. Is this a valid explanation or not?

[edit]

Based on the replies below, it's clear that my approach using vsync and Present would not work. I think I found another way to achieve the desired result, so I'm posting it here in case someone can spot errors in my thinking, or just for the information of anyone else working on a similar problem.

The first step is to get rid of D3DPRESENT_INTERVAL_ONE. That disables vsync and ensures that any call to SDL_RenderPresent will return immediately. Next, you can use IDirect3DDevice9::GetRasterStatus to get information about the current monitor state. It provides a boolean field (InVBlank) that is set to true during the pause between two refresh cycles, and another field (ScanLine) that tells you which line is currently being drawn during an active refresh. Using these two pieces of information it's possible to implement your own vertical synchronization routine, albeit by running a loop that constantly polls the monitor status and thus consumes 100% of the CPU. This is acceptable for my needs.
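
A minimal sketch of that polling loop follows. The device pointer is a placeholder; getting the IDirect3DDevice9 out of SDL is not shown here and depends on the SDL 1.3 snapshot:

    #include <d3d9.h>

    // Busy-wait until the current vertical blank ends and the monitor starts
    // scanning a new frame from the top.  "device" is assumed to be the
    // IDirect3DDevice9* behind the SDL renderer.
    void WaitForRefreshStart(IDirect3DDevice9* device)
    {
        D3DRASTER_STATUS rs;

        // First wait until we enter the vertical blank...
        do {
            device->GetRasterStatus(0, &rs);   // swap chain 0
        } while (!rs.InVBlank);

        // ...then wait until it ends, i.e. ScanLine starts counting up again.
        do {
            device->GetRasterStatus(0, &rs);
        } while (rs.InVBlank);
    }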

There is still the question of buffering - how do I know which frame is to be drawn on the screen when I call SDL_RenderPresent? I think I found a way to determine this, which relies on my ability to know what line on the monitor is currently being drawn. Here's the basic logic:

  1. Wait for a new refresh cycle to start (pause = false, scanline = 0).
  2. Fill the next back buffer with red color and call Present.
  3. Wait for scanline to reach 32.
  4. Fill the next back buffer with green and call Present.

And so on... In my demo implementation I used red, green, blue, and finally black. The idea is that you would see the RGB color pattern only if GetRasterStatus provides accurate information about the refresh status, and the front and back buffers are flipped immediately when SDL_RenderPresent is called. If either of those conditions is not met, you may not see anything, the colors could be swapped or overlapping, etc. If, on the other hand, you see a constant RGB pattern at the top of the screen for each frame, then this proves that you have direct control over the drawn image.
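
Here is a rough sketch of one frame of that demo. WaitForRefreshStart is the helper from the snippet above, WaitForScanLine is a similar helper, FillAndPresent uses raw Direct3D calls as a stand-in for my actual SDL fill plus SDL_RenderPresent, and the 64/96 thresholds simply continue the 32-line pattern from the list above:

    // Spin until the raster has passed the given scanline of the current frame.
    void WaitForScanLine(IDirect3DDevice9* device, UINT line)
    {
        D3DRASTER_STATUS rs;
        do {
            device->GetRasterStatus(0, &rs);
        } while (rs.InVBlank || rs.ScanLine < line);
    }

    // Clear the back buffer to a solid color and present it immediately
    // (vsync is disabled, so Present returns right away).
    void FillAndPresent(IDirect3DDevice9* device, D3DCOLOR color)
    {
        device->Clear(0, NULL, D3DCLEAR_TARGET, color, 1.0f, 0);
        device->Present(NULL, NULL, NULL, NULL);
    }

    // One frame of the calibration test: red, green, and blue bands, then black.
    void RunCalibrationFrame(IDirect3DDevice9* device)
    {
        WaitForRefreshStart(device);                        // scanline back at the top
        FillAndPresent(device, D3DCOLOR_XRGB(255, 0, 0));   // red

        WaitForScanLine(device, 32);
        FillAndPresent(device, D3DCOLOR_XRGB(0, 255, 0));   // green

        WaitForScanLine(device, 64);
        FillAndPresent(device, D3DCOLOR_XRGB(0, 0, 255));   // blue

        WaitForScanLine(device, 96);
        FillAndPresent(device, D3DCOLOR_XRGB(0, 0, 0));     // black for the rest of the frame
    }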

I should add that I tested this theory on several computers at work today. Most did display the pattern, but at least one had the entire screen painted red. A few would have the color bands jump up and down, indicating some inconsistency in swapping the buffers. This usually happened on older machines. I think this is a good calibration test to determine if the hardware is suitable for our testing purposes.

A: 

I highly recommend you look at Microsoft's GPUView. Here is a presentation that introduces the tool.

  • D3D will typically buffer more than one frame's worth of rendering commands (including Presents). For an example, see slide 25, where ~3 frames are buffered on the BumpEarth device queue. This explains why your first 3-4 calls return immediately (the Present packets are the crossed ones); they just get queued.
  • Unless you're doing full-screen rendering, the OS needs to do some compositing (the same slide shows the compositing happening on vsync - the blue vertical line).

Some consequences:

  • Present returning gives you no guarantee at all about when your just-submitted rendering commands will actually show up on screen.
  • How long your commands will take to render a frame is not easy to figure out. I've seen applications rely on previously rendered timings, smoothed (to prevent ping-pong rendering changes); see the sketch below.
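
For illustration, that smoothing is usually just an exponential moving average over measured frame times; a minimal sketch (the 0.1 factor is an arbitrary choice):

    // Exponential moving average of frame times, so a single outlier frame does
    // not cause a ping-pong reaction in whatever depends on the estimate.
    double smoothedFrameMs = 16.7;        // start near the expected refresh period

    void OnFrameTimed(double measuredMs)
    {
        const double alpha = 0.1;         // smoothing factor (assumed value)
        smoothedFrameMs = alpha * measuredMs + (1.0 - alpha) * smoothedFrameMs;
    }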

As additional comments:

  • I've witnessed ~1.5 frames' worth of command buffering in real-life workloads.
  • Even when the vsync happens and the video card updates the front buffer, the monitor can still do some buffering internally (more so since we left CRTs behind).

I've got to ask, why do you need to control exactly when the frame shows on screen?

Bahbar
Thanks for the explanation (even though it's a bit of bad news). I'm writing software that will measure human reaction time for scientific studies. Ideally, I'd like to get the margin of error down to 1 ms, but I doubt that this will be possible without some specialized hardware and a real-time OS. It may come to that eventually. If it's not possible to control how DirectX will buffer frames, could I get better results by disabling the buffers and vsync completely, and using IDirect3DDevice9::GetRasterStatus to determine when it's safe to update the screen?
VokinLoksar
You can use a query to make SURE the frame has been flushed to the graphics card ... though even then you cannot be 100% sure of much ... some TFT screens have interesting delays too. You are best off getting a high-speed camera, filming the screen and the user, and counting the frames until the user reacts. A 300 fps camera would give you ~3 ms accuracy, so if you can get a 1000 fps camera you'd have your ~1 ms ... Doing it in code without knowing a LOT more about your system is impossible, though ...
Goz
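
For reference, the query Goz mentions would typically be a D3DQUERYTYPE_EVENT query; a rough sketch of the usual pattern (error handling omitted, busy-waiting for simplicity):

    #include <d3d9.h>

    // Issue an event query after the draw calls, then spin until the GPU reports
    // it has processed everything submitted up to that point.
    void WaitForGpuToFinish(IDirect3DDevice9* device)
    {
        IDirect3DQuery9* query = NULL;
        if (FAILED(device->CreateQuery(D3DQUERYTYPE_EVENT, &query)))
            return;

        query->Issue(D3DISSUE_END);          // mark the end of the submitted work

        // S_FALSE means "not done yet"; D3DGETDATA_FLUSH pushes the command
        // buffer to the card so the query can actually complete.
        while (query->GetData(NULL, 0, D3DGETDATA_FLUSH) == S_FALSE)
            ;                                // busy-wait (could yield instead)

        query->Release();
    }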