I'm doing some processing on some very large videos (frames are often up to 16MP), and I need a way to store these videos in a format that allows seeking to specific frames (rather than to timestamps, as ffmpeg does). I was planning on just rolling my own format that concatenates all of the individually zlib-compressed frames together, and then appends an index at the end that maps frame numbers to byte offsets in the file. Before I go about this though, I wanted to make sure I'm not duplicating the functionality of another format/library. Has anyone heard of a format/library that allows lossless compression and random access of videos?
Some formats allow you to change the number of key frames per second.
For example, I've used ffmpeg to encode to flv at 25 frames per second with 25 key frames per second (that is, every frame a key frame), and then used a player that could seek to key frames. Basically this allowed me to do frame by frame seeking.
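Roughly, the encode looks like the sketch below. This is a reconstruction rather than the exact command I ran: the helper name `all_keyframe_cmd` and the file names are made up for illustration. The relevant option is `-g`, ffmpeg's GOP size; setting it to 1 asks the encoder to make every frame a key frame.

```python
def all_keyframe_cmd(src, dst, fps=25):
    """Build an ffmpeg command line that asks for a key frame on every frame.

    Hypothetical helper: only the ffmpeg options themselves are real.
    """
    return [
        "ffmpeg",
        "-i", src,        # input file
        "-r", str(fps),   # output frame rate
        "-g", "1",        # GOP size of 1: each frame is its own key frame
        dst,
    ]


# To actually run it (requires ffmpeg on the PATH):
#   import subprocess
#   subprocess.run(all_keyframe_cmd("input.avi", "output.flv"), check=True)
```

Expect the file to be considerably larger than with the default key frame interval, since no frame can borrow data from its neighbours.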
Also, the last time I checked, QuickTime could do frame by frame seeking without requiring every frame to be a key frame.
This may not be applicable to you, but those are my thoughts.
The reason it is hard to seek to a specific frame in most video codecs is that most frames depend on another frame or frames, so frames must be decoded as a group. For this reason, most libraries will only let you seek to the closest I-frame (Intra-frame - independently decodable frame). To actually produce an image from a non-I-frame, data from other frames is required, so you have to decode a number of frames worth of data.
The only ways I have seen this problem solved involve creating an index of some kind on the file. In other words, make a pass through the file and create an index of which frame corresponds to a certain time or section of the file. Since the seeking functions of most libraries can only seek to an I-frame, you may have to seek to the closest preceding I-frame and then decode forward from there to the exact frame you want.
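Once you have such an index, finding where to seek is just a sorted lookup. A minimal sketch, assuming the index has been reduced to a sorted list of the frame numbers that are I-frames (the function name `seek_plan` is made up for illustration):

```python
from bisect import bisect_right


def seek_plan(keyframes, target):
    """Return (I-frame to seek to, number of frames to decode after it).

    keyframes: sorted list of frame numbers that are I-frames (assumed
    to have been collected in an earlier indexing pass over the file).
    """
    i = bisect_right(keyframes, target) - 1
    if i < 0:
        raise ValueError("no I-frame at or before the target frame")
    start = keyframes[i]
    return start, target - start


print(seek_plan([0, 12, 24, 36], 30))  # (24, 6)
```

So for frame 30 you would seek to the I-frame at 24, decode it, and then decode six more frames, discarding all but the last.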
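If space is not of high importance, I would suggest doing it like you say, but use an image codec such as JPEG instead of zlib, as it will give you a much higher compression ratio since it exploits the fact that you are dealing with image data. Keep in mind that ordinary JPEG is lossy; since you asked for lossless compression, a lossless image format such as PNG would be the equivalent choice.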
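A rough sketch of that layout, to show how little code the container itself needs. The on-disk format here (fixed-size little-endian index entries followed by a 16-byte footer) and the function names are my own assumptions, not an existing format. zlib is used as the per-frame codec, as in the question; any other per-frame codec could be dropped in at the compress and decompress calls.

```python
import os
import struct
import zlib

ENTRY = struct.Struct("<QQ")   # per frame: (byte offset, compressed size)
FOOTER = struct.Struct("<QQ")  # end of file: (index offset, frame count)


def write_frames(path, frames):
    """Write each frame compressed on its own, then the index, then a footer."""
    index = []
    with open(path, "wb") as f:
        for raw in frames:
            blob = zlib.compress(raw)
            index.append((f.tell(), len(blob)))
            f.write(blob)
        index_offset = f.tell()
        for entry in index:
            f.write(ENTRY.pack(*entry))
        f.write(FOOTER.pack(index_offset, len(index)))


def read_frame(path, n):
    """Return the raw bytes of frame n without touching any other frame."""
    with open(path, "rb") as f:
        f.seek(-FOOTER.size, os.SEEK_END)
        index_offset, count = FOOTER.unpack(f.read(FOOTER.size))
        if not 0 <= n < count:
            raise IndexError(n)
        f.seek(index_offset + n * ENTRY.size)
        offset, size = ENTRY.unpack(f.read(ENTRY.size))
        f.seek(offset)
        return zlib.decompress(f.read(size))
```

Reading a frame costs three seeks and two small reads plus the frame itself, regardless of how many frames are in the file. Frame dimensions and pixel format are not stored here; a real format would want a header for those.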
If space is an issue, P frames (depend on previous frame/frames) can greatly reduce the size of the file. I would not mess with B frames (depend on previous and future frame/frames) since they make it much harder to get things right.
I have solved the problem of seeking to a specific frame in the presence of B and P frames in the past using ffmpeg (libavformat) to demux the video into packets (one frame's worth of data per packet) and concatenate these into a single file. The important thing is to keep an index into that file so you can find packet bounds for a given frame. If the frame is an I-frame, you can just feed that frame's data into an ffmpeg decoder and it can be decoded. If the frame is a B or P frame, you have to go back to the last I-frame and decode forward from there. This can be quite tricky to get right, especially for B-frames since they are often sent in a different order than how they are displayed.
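To make the ordering issue concrete, here is a simplified sketch of the index lookup, not the code I actually used. It assumes the index lists packets in the order they sit in the file (decode order), that each entry records the frame's display number, and that every group of frames is self-contained (closed GOPs). The `Packet` record and `packets_for_frame` are made-up names; the actual decoding through ffmpeg is left out.

```python
from collections import namedtuple

# One index entry per packet, listed in file (decode) order.
Packet = namedtuple("Packet", "offset size is_key display_no")


def packets_for_frame(index, display_no):
    """Return the packets to feed the decoder, in order, to get one frame."""
    pos = next(
        (i for i, p in enumerate(index) if p.display_no == display_no), None
    )
    if pos is None:
        raise KeyError(display_no)
    start = pos
    # Walk back to the nearest I-frame at or before this packet.
    while start > 0 and not index[start].is_key:
        start -= 1
    return index[start:pos + 1]


# Display order I0 B1 B2 P3 I4 is stored as I0 P3 B1 B2 I4.
index = [
    Packet(0, 100, True, 0),
    Packet(100, 40, False, 3),
    Packet(140, 20, False, 1),
    Packet(160, 20, False, 2),
    Packet(200, 90, True, 4),
]
print([p.display_no for p in packets_for_frame(index, 1)])  # [0, 3, 1]
```

Note that to show frame 1 you end up decoding frame 3 first. The decoder may also hold pictures back and hand them out in display order, so you still need to check the timestamp or frame number on each decoded picture instead of assuming the last one out is the one you asked for.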