I'd use DirectShow.NET, because it'll let you do a lot of the work in managed code, which is quite a bit more friendly than doing it in native code.
You'll have to construct a filter graph to render the file you want. That means you'll need a file reader for the container format (e.g. if it's an MP4 file, you'll need an MP4 demux) and a decoder for the video format (e.g. if it's H264, you'll need an H264 decoder filter). I'd use Windows 7 if possible, since it has much better media support than earlier versions of Windows.
Your graph should look something like:
File Reader -> Video Decoder -> Sample Grabber -> Null Renderer
You'll construct your graph, and then use the IMediaSeeking interface (its SetPositions method) to seek to the approximate time of the sample you want. Then run the graph. The decompressed frames will arrive through the Sample Grabber's callback interface. You can check their timestamps and keep the frame that's closest to the time you need.
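As a rough sketch of the seek-and-pick logic only (plain Python rather than DirectShow.NET, so none of this is the library's API): the helper names `seconds_to_reference_time` and `closest_frame` are made up for illustration, and I'm assuming the timestamps you get in the callback are in seconds and in the same timebase as your target time. The one DirectShow fact it relies on is that IMediaSeeking positions are, by default, REFERENCE_TIME values counted in 100-nanosecond units.

```python
from typing import Any, Iterable, Optional, Tuple

# REFERENCE_TIME counts 100-nanosecond ticks, so there are 10 million per second.
TICKS_PER_SECOND = 10_000_000


def seconds_to_reference_time(seconds: float) -> int:
    """Convert seconds to 100 ns ticks (hypothetical helper, not a DirectShow call)."""
    return round(seconds * TICKS_PER_SECOND)


def closest_frame(
    samples: Iterable[Tuple[float, Any]], target_seconds: float
) -> Optional[Tuple[float, Any]]:
    """Return the (timestamp, frame) pair nearest to target_seconds.

    `samples` stands in for what the Sample Grabber callback would hand you;
    it is assumed to yield (timestamp_in_seconds, frame_data) pairs.
    Returns None if no samples were seen.
    """
    best: Optional[Tuple[float, Any]] = None
    for timestamp, frame in samples:
        # Keep this sample if it is the first one or strictly closer than the best so far.
        if best is None or abs(timestamp - target_seconds) < abs(best[0] - target_seconds):
            best = (timestamp, frame)
    return best
```

In a real callback you'd keep the running "best" in a field rather than iterating a list, and you could stop the graph once the timestamps move past the target, since they only get further away from there.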
From there, you can use .NET to save it as whatever image format you like (JPEG is probably best).