



When I create a pipeline that uses H264 to transmit video, I get an enormous delay: up to 10 seconds to transmit video from my machine to... my machine! This is unacceptable for my goals, and I'd like to ask StackOverflow what I (or someone else) am doing wrong.

I took the pipelines from the gstrtpbin documentation page and slightly modified them to use Speex:

This is the sender pipeline:

#!/bin/sh

gst-launch -v gstrtpbin name=rtpbin \
        v4l2src ! ffmpegcolorspace ! ffenc_h263 ! rtph263ppay ! rtpbin.send_rtp_sink_0 \
                  rtpbin.send_rtp_src_0 ! udpsink host= port=5000                            \
                  rtpbin.send_rtcp_src_0 ! udpsink host= port=5001 sync=false async=false    \
                  udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0                           \
        pulsesrc ! audioconvert ! audioresample  ! audio/x-raw-int,rate=16000 !    \
                  speexenc bitrate=16000 ! rtpspeexpay ! rtpbin.send_rtp_sink_1                   \
                  rtpbin.send_rtp_src_1 ! udpsink host= port=5002                            \
                  rtpbin.send_rtcp_src_1 ! udpsink host= port=5003 sync=false async=false    \
                  udpsrc port=5007 ! rtpbin.recv_rtcp_sink_1

Receiver pipeline:


gst-launch -v\
    gstrtpbin name=rtpbin                                          \
    udpsrc caps="application/x-rtp,media=(string)video, clock-rate=(int)90000, encoding-name=(string)H263-1998" \
            port=5000 ! rtpbin.recv_rtp_sink_0                                \
        rtpbin. ! rtph263pdepay ! ffdec_h263 ! xvimagesink                    \
     udpsrc port=5001 ! rtpbin.recv_rtcp_sink_0                               \
     rtpbin.send_rtcp_src_0 ! udpsink port=5005 sync=false async=false        \
    udpsrc caps="application/x-rtp,media=(string)audio, clock-rate=(int)16000, encoding-name=(string)SPEEX, encoding-params=(string)1, payload=(int)110" \
            port=5002 ! rtpbin.recv_rtp_sink_1                                \
        rtpbin. ! rtpspeexdepay ! speexdec ! audioresample ! audioconvert ! alsasink \
     udpsrc port=5003 ! rtpbin.recv_rtcp_sink_1                               \
     rtpbin.send_rtcp_src_1 ! udpsink host= port=5007 sync=false async=false

Those pipelines, a combination of H263 and Speex, work well enough. I snap my fingers near the camera and microphone, and I see the movement and hear the sound at the same time.

Then I changed the pipelines to use H264 in the video path.

The sender becomes:

#!/bin/sh

gst-launch -v gstrtpbin name=rtpbin \
        v4l2src ! ffmpegcolorspace ! x264enc bitrate=300 ! rtph264pay ! rtpbin.send_rtp_sink_0 \
                  rtpbin.send_rtp_src_0 ! udpsink host= port=5000                            \
                  rtpbin.send_rtcp_src_0 ! udpsink host= port=5001 sync=false async=false    \
                  udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0                           \
        pulsesrc ! audioconvert ! audioresample  ! audio/x-raw-int,rate=16000 !    \
                  speexenc bitrate=16000 ! rtpspeexpay ! rtpbin.send_rtp_sink_1                   \
                  rtpbin.send_rtp_src_1 ! udpsink host= port=5002                            \
                  rtpbin.send_rtcp_src_1 ! udpsink host= port=5003 sync=false async=false    \
                  udpsrc port=5007 ! rtpbin.recv_rtcp_sink_1

And the receiver becomes:

#!/bin/sh

gst-launch -v\
    gstrtpbin name=rtpbin                                          \
    udpsrc caps="application/x-rtp,media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264" \
            port=5000 ! rtpbin.recv_rtp_sink_0                                \
        rtpbin. ! rtph264depay ! ffdec_h264 ! xvimagesink                    \
     udpsrc port=5001 ! rtpbin.recv_rtcp_sink_0                               \
     rtpbin.send_rtcp_src_0 ! udpsink port=5005 sync=false async=false        \
    udpsrc caps="application/x-rtp,media=(string)audio, clock-rate=(int)16000, encoding-name=(string)SPEEX, encoding-params=(string)1, payload=(int)110" \
            port=5002 ! rtpbin.recv_rtp_sink_1                                \
        rtpbin. ! rtpspeexdepay ! speexdec ! audioresample ! audioconvert ! alsasink \
     udpsrc port=5003 ! rtpbin.recv_rtcp_sink_1                               \
     rtpbin.send_rtcp_src_1 ! udpsink host= port=5007 sync=false async=false

This is what happens under Ubuntu 10.04. I didn't notice such huge delays on Ubuntu 9.04; the delays there were in the range of 2-3 seconds, AFAIR.

+1  A: 

Something in there is buffering, most likely the encoder. The more data it has to work with, the more effective compression it can achieve. I'm not familiar with that encoder, but there is usually a setting for the amount of buffering.

I don't think the problem is buffering. I tried introducing a delay (the GStreamer element "shift" from gentrans) into the audio part of the pipeline: the delay at start got even longer, the audio was delayed, and the video was delayed even more. I think it's a weird synchronization problem.
Serguey Zefirov
+1  A: 

With some help from "Sharktooth" in #x264 on Freenode, I found out that the git version of gst-plugins-ugly supports the "zero-latency" preset.

I tweaked your example to set "x264enc pass=qual quantizer=20 tune=zerolatency", and the latency seems to stay at 0.7 - 0.9 seconds. I can't figure out how to get it any lower.
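
For reference, here is a rough sketch of how the video branch of the sender might look with those settings. The x264enc properties are the ones quoted above and need a gst-plugins-ugly build that supports "tune"; the HOST variable and its 127.0.0.1 default are placeholders I made up, since the original pipelines leave host= empty, and the sender_video helper is just a hypothetical wrapper for readability. The audio branch is omitted because it is unchanged from the question.

```shell
#!/bin/sh
# Sketch only. HOST is a placeholder: the pipelines in the question leave
# "host=" empty, so 127.0.0.1 is an assumption for a same-machine test.
HOST="${HOST:-127.0.0.1}"

# Encoder settings quoted from this answer.
X264="x264enc pass=qual quantizer=20 tune=zerolatency"

# Hypothetical helper: prints the video branch of the sender pipeline.
# The audio branch (ports 5002, 5003, 5007) is unchanged from the question.
sender_video() {
    printf '%s ' \
        "gstrtpbin name=rtpbin" \
        "v4l2src ! ffmpegcolorspace ! $X264 ! rtph264pay ! rtpbin.send_rtp_sink_0" \
        "rtpbin.send_rtp_src_0 ! udpsink host=$HOST port=5000" \
        "rtpbin.send_rtcp_src_0 ! udpsink host=$HOST port=5001 sync=false async=false" \
        "udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0"
}

# Print the full command so it can be inspected before running it.
echo "gst-launch -v $(sender_video)"
```

To actually start the sender, run gst-launch -v $(sender_video) with the substitution left unquoted, so the shell splits the text back into separate pipeline tokens.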


Something in there is buffering, most likely the encoder.

The incoming data is first stored in a queue inside x264. Change the settings of that queue so that the delay is reduced.
