Hi there,

I'm writing a custom DirectShow source push filter which is supposed to receive RTP data from a video server and push it to the renderer. I wrote a CVideoPushPin class which inherits from CSourceStream and a CVideoReceiverThread class which is a wrapper for a thread that receives RTP packets from the video server. The receiver thread essentially does three things:

  • receives raw RTP packets and collects some data that is needed for Receiver Reports
  • assembles frames, copies them to the buffer and stores information about them in a 256-element queue (a simplified sketch of the enqueue path follows this list), which is defined as follows:

    struct queue_elem {
       char *start; // Pointer to a frame in a buffer
       int length; // Length of data
       REFERENCE_TIME recvTime; // Timestamp when the frame was received (stream time)
    };
    
    
    struct data {
       struct queue_elem queue[QUEUE_LENGTH];
       int qWrIdx;
       int qRdIdx;
       HANDLE mutex;
    };
    
  • every received frame is timestamped with the current stream time:

    p->StreamTime(refTime);
    REFERENCE_TIME rt = refTime.GetUnits();
    
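    Roughly, the enqueue side in the receiver thread looks like this (a simplified sketch, not the exact code; it assumes the struct above and a Win32 mutex):

        // Simplified sketch of the receiver thread's enqueue path (not the exact code).
        // The assembled frame has already been copied into the ring buffer at frameStart.
        void EnqueueFrame(struct data *d, char *frameStart, int frameLength, REFERENCE_TIME recvTime)
        {
            WaitForSingleObject(d->mutex, INFINITE);

            struct queue_elem *e = &d->queue[d->qWrIdx];
            e->start    = frameStart;
            e->length   = frameLength;
            e->recvTime = recvTime;     // stream time at arrival

            if(++d->qWrIdx >= QUEUE_LENGTH)
            {
                d->qWrIdx = 0;
            }

            ReleaseMutex(d->mutex);
        }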

The problem is that I'm not sure how I should set the timestamps for every MediaSample in the FillBuffer method. I tried several ways, but the playback either stops or is too slow. Currently the FillBuffer method looks like this:

    REFERENCE_TIME thisFrameStartTime, thisFrameEndTime;
    // Make sure there are at least 4 frames in the buffer
    if(noOfFrames >= 4)
    {
        currentQe = m_myData.queue[m_myData.qRdIdx++]; // Take current frame description
        if(m_myData.qRdIdx >= QUEUE_LENGTH)
        {
            m_myData.qRdIdx = 0;
        }
        nextQe = m_myData.queue[m_myData.qRdIdx]; // Take next frame description
        if(currentQe.length > 0)
        {
            memcpy(pData, currentQe.start, currentQe.length);
            pSample->SetActualDataLength(currentQe.length);

            CRefTime refTime;
            m_pFilter->StreamTime(refTime);
            REFERENCE_TIME rt;
            rt = refTime.GetUnits();

            pSample->GetTime(&thisFrameStartTime, &thisFrameEndTime);
            thisFrameEndTime = thisFrameStartTime + (nextQe.recvTime - currentQe.recvTime);
            pSample->SetTime(&thisFrameStartTime, &thisFrameEndTime);
        }
    }
    else
    {
        pSample->SetActualDataLength(0);
    }

In this case I noticed that the number of items in the queue increases very quickly (for some reason the FillBuffer method cannot pull out data fast enough), and the result is an increasing delay when playing the video. Does anybody have an idea how I should do the timestamping when receiving data from live sources?

+2  A: 

The renderer will draw the frames when the graph's stream time reaches the timestamp on the sample object. If I read your code correctly, you are timestamping them with the stream time at arrival, so they will always be late at rendering. This is confused somewhat by the audio renderer: if the audio renderer is providing the graph's clock, then it will report the current stream time to be whatever sample it is currently playing, and that is going to cause some undesirable time behaviour.

  1. You want to set a time in the future, to allow for the latency through the graph and any buffering in your filter. Try setting a time perhaps 300ms into the future (stream time now + 300ms).

  2. You want to be consistent between frames, so don't timestamp them based on the arrival time of each frame. Use the RTP timestamp for each frame, and set the baseline for the first one to be 300ms into the future; subsequent frames are then (rtp - rtp_at_baseline) + dshow baseline (with appropriate unit conversions); a rough sketch of this mapping follows the list.

  3. You need to timestamp the audio and the video streams in the same way, using the same baseline. However, if I remember correctly, RTP timestamps have a different baseline in each stream, so you need to use the RTCP packets to convert RTP timestamps to (absolute) NTP time, and then convert NTP to DirectShow time using your initial baseline (baseline NTP = dshow stream time now + 300ms).
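
Something along these lines (a rough, untested sketch; names like m_streamBase, m_rtpBase and m_baselineSet are illustrative, not from your code, and 90000 assumes the usual 90kHz RTP video clock; for audio/video sync you would derive the times from NTP via RTCP instead, but the baseline idea is the same):

    // Rough sketch only: map RTP timestamps onto the graph's stream time,
    // anchoring the first frame 300ms in the future.
    static const REFERENCE_TIME LATENCY = 300 * 10000;   // 300ms in 100ns units

    REFERENCE_TIME CVideoPushPin::RtpToStreamTime(DWORD rtp)
    {
        if(!m_baselineSet)
        {
            CRefTime now;
            m_pFilter->StreamTime(now);
            m_streamBase  = now.GetUnits() + LATENCY;   // first frame plays 300ms from now
            m_rtpBase     = rtp;
            m_baselineSet = true;
        }
        // RTP ticks since the baseline, converted from 90kHz to 100ns units
        LONGLONG delta = (LONGLONG)(DWORD)(rtp - m_rtpBase);
        return m_streamBase + (delta * 10000000) / 90000;
    }

    // In FillBuffer, instead of stamping with the arrival time:
    REFERENCE_TIME tStart = RtpToStreamTime(currentRtpTimestamp);
    REFERENCE_TIME tEnd   = RtpToStreamTime(nextRtpTimestamp);   // or tStart + frame duration
    pSample->SetTime(&tStart, &tEnd);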

G

Geraint Davies
Geraint, thanks for your input. I made some changes in my code; however, the video freezes just after I run it. In the log file I noticed that the FillBuffer method gets called only twice. When it is called for the first time, the stream time is 3633950000, frameStartTime is 3635700000 and frameEndTime is 3635703600. The second time, the stream time is 3634370000, frameStartTime is 3635703600 and frameEndTime is 3635707200. So if I understand correctly, the renderer should wait for the stream time to reach the timestamp on the first frame and then run smoothly, but unfortunately that doesn't happen.
mkurek