I set profile_idc, level_idc, extradata and extradata_size of AVCodecContext using the profile-level-id and sprop-parameter-sets from the SDP.

I handle the decoding of Coded Slice, SPS, PPS and NAL_IDR_SLICE packets separately:

Init:

uint8_t start_sequence[] = {0, 0, 1};
int size = recv(id_de_la_socket, (char*)rtpReceive, 65535, 0);

Coded Slice :

    char *z = new char[size - 16 + sizeof(start_sequence)];
    memcpy(z, start_sequence, sizeof(start_sequence));
    memcpy(z + sizeof(start_sequence), rtpReceive + 16, size - 16);
    ConsumedBytes = avcodec_decode_video(codecContext, pFrame, &GotPicture, (uint8_t*)z, size - 16 + sizeof(start_sequence));
    delete[] z;

Result: ConsumedBytes >0 and GotPicture >0 (often)

SPS and PPS :

Identical code. Result: ConsumedBytes > 0 and GotPicture = 0.

I think that's normal.

When I find a new SPS/PPS pair, I update extradata and extradata_size with the payloads of these packets and their sizes.

NAL_IDR_SLICE :

The NAL unit type is 28 => IDR frames are fragmented, therefore I tried two methods of decoding:

1) I prefix the first fragment (without the RTP header) with the sequence 0x000001 and send it to avcodec_decode_video. Then I send the remaining fragments to this function.

2) I prefix the first fragment (without the RTP header) with the sequence 0x000001 and concatenate the remaining fragments to it. I send this buffer to the decoder.

In both cases I get no error (ConsumedBytes > 0) but I detect no frame (GotPicture = 0)...

What is the problem?

+1  A: 

I don't know about the rest of your implementation, but it seems likely the 'fragments' you are receiving are NAL units. Therefore, each may need the NALU start code (00 00 01 or 00 00 00 01) prepended when you reconstruct the bitstream before sending it to ffmpeg.

At any rate, you might find the RFC for H264 RTP packetization useful:

http://www.rfc-editor.org/rfc/rfc3984.txt

Hope this helps!

Scott Danahy
I don't have enough karma to comment on your question or answer below, but are you prepending the NALU start code to EACH 'fragment'?
Scott Danahy
You don't need to do that... The fragments are parts of one IDR. The NAL unit header is transmitted only in the first fragment, not in each one. To decode it you don't need to add a start code to every fragment, because the NAL unit header defines the H264 payload that follows it (its lower 5 bits do that).
Cipi
A: 

RFC says : "if the fragmentation unit payloads of consecutive FUs are sequentially concatenated, the payload of the fragmented NAL unit can be reconstructed."

And then : " If a decapsulated packet is an FU-A, all the fragments of the fragmented NAL unit are concatenated and passed to the decoder."

Before sending it to ffmpeg, I prepend the NALU start code to the concatenated sequence.

That is what I'm doing in 2), but no result...

bben
Please edit your question instead of posting an answer that is not a real one...
neuro
Sorry, I was not connected when I posted, therefore I can't edit my question.
bben
+1  A: 

In RTP, all H264 I-frames (IDRs) are usually fragmented. When you receive RTP you must first skip the header (usually the first 12 bytes) to get to the NAL unit (the first payload byte). If the NAL unit type is 28 (0x1C), it means that the following payload is one fragment of an H264 IDR (I-frame) and that you need to collect all of them to reconstruct the IDR.

Fragmentation occurs because the MTU is limited and the IDR is much larger. One fragment can look like this:

Fragment that has START BIT = 1:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS] 
Second byte: [ START BIT | RESERVED BIT | END BIT | 5 NAL UNIT BITS] 
Other bytes: [... IDR FRAGMENT DATA...]

Other fragments (per RFC 3984, the first two bytes are present in every fragment; only the start/end bits change):

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS ]
Second byte: [ START BIT (0) | RESERVED BIT | END BIT | 5 NAL UNIT BITS ]
Other bytes: [... IDR FRAGMENT DATA...]

To reconstruct the IDR you must collect this info:

int fragment_type = Data[0] & 0x1F;
int nal_type = Data[1] & 0x1F;
int start_bit = Data[1] & 0x80;
int end_bit = Data[1] & 0x40;

If fragment_type == 28, then the payload following it is one fragment of an IDR. Next, check whether start_bit is set; if it is, that fragment is the first one in the sequence. You use it to reconstruct the IDR's NAL header byte by taking the first 3 bits from the first payload byte (3 NAL UNIT BITS) and combining them with the last 5 bits of the second payload byte (5 NAL UNIT BITS), so you get a byte like this: [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Write that NAL byte first into a clean buffer, followed by the remaining bytes of that fragment. Remember to skip the first two bytes of each fragment (the FU indicator and FU header), since they only identify the fragment and are not part of the IDR.

If start_bit and end_bit are 0, just append the payload (skipping the two bytes that identify the fragment) to the buffer.

If start_bit is 0 and end_bit is 1, it is the last fragment: append its payload (again skipping the two identifying bytes) to the buffer, and your IDR is now reconstructed.
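The steps above can be sketched like this (a minimal sketch only; the function name `consume_fu_a` and the buffer handling are illustrative, not from any library, and the 12-byte RTP header is assumed to have been stripped already):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Feed one FU-A payload (RTP header already removed) into "nal".
// Returns true when the end bit is seen, i.e. "nal" now holds the
// reconstructed NAL unit (without a start code).
bool consume_fu_a(const uint8_t* payload, size_t len, std::vector<uint8_t>& nal)
{
    if (len < 2) return false;

    int  fragment_type = payload[0] & 0x1F;        // 28 => FU-A
    int  nal_type      = payload[1] & 0x1F;        // type of the fragmented NAL unit
    bool start_bit     = (payload[1] & 0x80) != 0;
    bool end_bit       = (payload[1] & 0x40) != 0;

    if (fragment_type != 28) return false;         // not an FU-A packet

    if (start_bit) {
        nal.clear();
        // Rebuild the original NAL header byte: upper 3 bits (F + NRI)
        // from the FU indicator, lower 5 bits (type) from the FU header.
        nal.push_back((uint8_t)((payload[0] & 0xE0) | nal_type));
    }
    // Every fragment: skip the FU indicator and FU header (2 bytes).
    nal.insert(nal.end(), payload + 2, payload + len);

    return end_bit;                                // IDR complete on end bit
}
```

For example, a two-fragment IDR slice (type 5) with NRI = 11 reassembles to a NAL unit starting with the header byte 0x65.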

If you need some code, just ask in a comment and I'll post it, but I think it's pretty clear how to do it... =)

CONCERNING THE DECODING

It crossed my mind today why you get an error on decoding the IDR (I presumed you have reconstructed it correctly). How are you building your AVC Decoder Configuration Record? Does the lib that you use automate that? If not, and you haven't heard of this, continue reading...

The AVCDCR is specified to allow decoders to quickly parse all the data they need to decode an H264 (AVC) video stream. It carries the following data:

  • ProfileIDC
  • ProfileIOP
  • LevelIDC
  • SPS (Sequence Parameter Sets)
  • PPS (Picture Parameter Sets)

All this data is sent in the RTSP session's SDP, in the fields profile-level-id and sprop-parameter-sets.

DECODING PROFILE-LEVEL-ID

The profile-level-id string is divided into 3 substrings, each 2 characters long:

[PROFILE IDC][PROFILE IOP][LEVEL IDC]

Each substring represents one byte in base16! So, if Profile IDC is 28, it is actually 40 in base10. Later you will use the base10 values to construct the AVC Decoder Configuration Record.
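In code, the splitting described above could look like this (a sketch; the struct and function names are made up for illustration):

```cpp
#include <cstdint>
#include <string>

// Split the 6-character profile-level-id string into its three bytes.
// Each 2-character substring is one base16 value.
struct ProfileLevelId { uint8_t profile_idc, profile_iop, level_idc; };

ProfileLevelId parse_profile_level_id(const std::string& s)
{
    // e.g. "420029" -> profile_idc = 0x42 = 66, profile_iop = 0, level_idc = 0x29 = 41
    ProfileLevelId id;
    id.profile_idc = (uint8_t)std::stoi(s.substr(0, 2), nullptr, 16);
    id.profile_iop = (uint8_t)std::stoi(s.substr(2, 2), nullptr, 16);
    id.level_idc   = (uint8_t)std::stoi(s.substr(4, 2), nullptr, 16);
    return id;
}
```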

DECODING SPROP-PARAMETER-SETS

The sprops are usually 2 strings (there could be more), comma separated and base64 encoded! You could fully parse them, but there is no need to. Your job here is just to convert them from base64 strings into byte arrays for later use. You then have 2 byte arrays: the first is the SPS, the second is the PPS.
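A sketch of that conversion (the helper names are illustrative; a minimal base64 decoder is inlined so the example stays self-contained):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Minimal base64 decoder: stops at '=' padding or any other
// non-alphabet character.
static std::vector<uint8_t> base64_decode(const std::string& in)
{
    static const std::string tbl =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<uint8_t> out;
    int val = 0, bits = -8;
    for (char c : in) {
        size_t pos = tbl.find(c);
        if (pos == std::string::npos) break;      // '=' padding ends the data
        val  = (val << 6) | (int)pos;
        bits += 6;
        if (bits >= 0) {
            out.push_back((uint8_t)((val >> bits) & 0xFF));
            bits -= 8;
        }
    }
    return out;
}

// Split the comma-separated sprop-parameter-sets value and decode each
// part; the first array is the SPS, the second the PPS.
std::vector<std::vector<uint8_t>> decode_sprops(const std::string& sprops)
{
    std::vector<std::vector<uint8_t>> sets;
    size_t start = 0;
    while (start <= sprops.size()) {
        size_t comma = sprops.find(',', start);
        if (comma == std::string::npos) comma = sprops.size();
        sets.push_back(base64_decode(sprops.substr(start, comma - start)));
        start = comma + 1;
    }
    return sets;
}
```

With the example sprops from the comments below (`Z0IAKeNQFAe2AtwEBAaQeJEV,aM48gA==`), the first decoded byte is 0x67 (NAL type 7, an SPS) and the second array starts with 0x68 (NAL type 8, a PPS).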

BUILDING THE AVCDCR

Now you have all you need to build the AVCDCR. Start by making a new clean buffer, then write these fields into it in the order explained here:

1 - Byte that has value 1 and represents version

2 - Profile IDC byte

3 - Profile IOP byte

4 - Level IDC byte

5 - Byte with value 0xFF (google the AVC Decoder Configuration Record to see what this is)

6 - Byte with value 0xE1

7 - Short with value of the SPS array length

8 - SPS byte array

9 - Byte with the number of PPS arrays (there could be more than one in sprop-parameter-sets)

10 - Short with the length of following PPS array

11 - PPS array
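The 11 steps above can be sketched as follows (an illustrative helper, assuming a single SPS and a single PPS as is usual with sprop-parameter-sets):

```cpp
#include <cstdint>
#include <vector>

// Assemble an AVC Decoder Configuration Record from the three
// profile-level-id bytes plus the decoded SPS and PPS byte arrays.
std::vector<uint8_t> build_avcdcr(uint8_t profile_idc, uint8_t profile_iop,
                                  uint8_t level_idc,
                                  const std::vector<uint8_t>& sps,
                                  const std::vector<uint8_t>& pps)
{
    std::vector<uint8_t> r;
    r.push_back(1);                              // 1) version
    r.push_back(profile_idc);                    // 2) Profile IDC
    r.push_back(profile_iop);                    // 3) Profile IOP
    r.push_back(level_idc);                      // 4) Level IDC
    r.push_back(0xFF);                           // 5) reserved bits + NALU length size
    r.push_back(0xE1);                           // 6) reserved bits + 1 SPS
    r.push_back((uint8_t)(sps.size() >> 8));     // 7) SPS length (big-endian short)
    r.push_back((uint8_t)(sps.size() & 0xFF));
    r.insert(r.end(), sps.begin(), sps.end());   // 8) SPS byte array
    r.push_back(1);                              // 9) number of PPS arrays
    r.push_back((uint8_t)(pps.size() >> 8));     // 10) PPS length (big-endian short)
    r.push_back((uint8_t)(pps.size() & 0xFF));
    r.insert(r.end(), pps.begin(), pps.end());   // 11) PPS byte array
    return r;
}
```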

DECODING VIDEO STREAM

Now you have a byte array that tells the decoder how to decode the H264 video stream. I believe you need this if your lib doesn't build it itself from the SDP...

Cipi
This library can build it itself, but I build it myself. With ffmpeg, these parameters are stored in a structure (AVCodecContext). I will try building the AVCDCR with your method. Thx
bben
Ok, then you are not reconstructing the IDR like you should... check the process once more. Hope I helped... =)
Cipi
It's good: the AVCDCR is recognized by the decoder and the parameters are set. The decoder does not decode the rest, but I think that is due to other ffmpeg parameters. Thank you for your help: I have already made significant progress.
bben
A: 

Any comment from the ffmpeg mailing list ?

I checked several messages about ffmpeg's depacketization process on the ffmpeg-user list, but I didn't find any information that could help me.

Why 0x000001? This is H264, not MPEG4.

The start code is the same.

I don't have enough karma to comment on your question or answer below, but are you appending the NALU startcode before to EACH 'fragment'?

I'm prepending the NALU start code to each fragment that is the start of a NALU. Therefore each NALU has a start code.

Sorry, I don't know why I can't edit my question or comment on some of your answers: I must answer instead.


Thx Cipi for your answer!

I did what you said, but I get an error from the decoder (ffmpeg, with the function avcodec_decode_video).

The error concerns the SPS.

Which SPS/PPS pair must I use to decode: the one from the SDP, the SPS/PPS packets, or both?

If you know ffmpeg, the error message is : sps_id (32) out of range.

When I parse the SDP, I get the SPS/PPS and build a NALU with them (forbidden bit = 0, NRI = 01, nal_type = 7 or 8) and I send it to ffmpeg.

Then, when I receive a NALU with nal_type = 7 or 8, I send it (without the RTP header) to ffmpeg.

What am I doing wrong this time?

bben
To start decoding you need the SDP's profile-level-id and sprop-parameter-sets, so you must pass those to the decoder (I don't know how). Sprops look like this: `Z0IAKeNQFAe2AtwEBAaQeJEV,aM48gA==`, and a profile-level-id like this: `420029`. The profile-level-id is made up of 3 HEX values: ProfileIDC, ProfileIOP, LevelIDC. The sprops are 2 base64 encoded strings, comma separated. The first one is the sequence parameter set and the other is the picture parameter set. To use them, you must decode them from base64 back into raw bytes (their original state).
Cipi
I get: profileIDC = 42; levelIDC = 29; profileIOP = 0. Are these good values? Do I update these values during streaming?
bben
Well, pay attention: those are in HEX (base16), so you must convert them to base10 to use them: profileIDC = 66, levelIDC = 41. No, you don't have to update anything; the decoder will always use these values because they don't change.
Cipi