I need to detect an MPEG4 I-frame in an RTP packet. I know how to remove the RTP header and get the MPEG4 frame inside it, but I can't figure out how to identify the I-frame.
Does it have a specific signature/header?
OK, so I figured it out for the H.264 stream; how to detect the I-frame there is described in the edit below.
I can't figure it out for the MPEG4-ES stream, though. Any suggestions?
EDIT: This works for my H.264 stream (fmtp:96 packetization-mode=1; profile-level-id=420029;). You just pass the byte array that represents the H.264 fragment received through RTP. If you want to pass the whole RTP packet, change the RTPHeaderBytes value so that it skips the RTP header. I always get the I-frame, because in my stream it is the only frame that gets fragmented. RFC 3984 allows any NAL unit to be fragmented, which is why the code also checks the NAL type carried in the second byte instead of relying on fragmentation alone. I use this (simplified) piece of code in my server, and it works like a charm! If the I-frame (IDR) is not fragmented, fragment_type is 5, so this code returns true for both fragmented and non-fragmented IDRs.
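    // Returns true if the H.264 RTP payload carries an IDR slice, either as a
    // single NAL unit (type 5) or as the first fragment of a fragmented IDR.
    public static bool isH264iFrame(byte[] paket)
    {
        int RTPHeaderBytes = 0; // set to the RTP header length if paket includes it
        if (paket == null || paket.Length < RTPHeaderBytes + 2)
        {
            return false; // too short to hold the two bytes read below
        }
        int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F; // type of the RTP payload
        int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;      // original NAL type (FU header)
        int start_bit = paket[RTPHeaderBytes + 1] & 0x80;     // S bit: first fragment
        // 28 = FU-A, 29 = FU-B
        if (((fragment_type == 28 || fragment_type == 29) && nal_type == 5 && start_bit == 128) || fragment_type == 5)
        {
            return true;
        }
        return false;
    }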
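For the case where the whole RTP packet is passed in, the value for RTPHeaderBytes could be computed from the packet itself rather than hard-coded. This is only a minimal sketch, assuming the fixed RTP header layout from RFC 3550 (12 bytes, plus 4 bytes per CSRC entry, plus an optional header extension). The class and method names are made up for illustration and are not part of the code above.

```csharp
using System;

public static class RtpHeaderSketch
{
    // Returns the offset of the payload inside a full RTP packet,
    // or -1 if the packet is too short or is not RTP version 2.
    public static int GetRtpHeaderLength(byte[] packet)
    {
        const int FixedHeaderBytes = 12;
        if (packet == null || packet.Length < FixedHeaderBytes)
        {
            return -1;
        }
        if ((packet[0] >> 6) != 2)
        {
            return -1; // the version field must be 2
        }

        int csrcCount = packet[0] & 0x0F;            // CC: number of 32-bit CSRC entries
        bool hasExtension = (packet[0] & 0x10) != 0; // X bit
        int length = FixedHeaderBytes + 4 * csrcCount;

        if (hasExtension)
        {
            // Extension header: 16-bit profile id, then a 16-bit length
            // counted in 32-bit words (not including these 4 bytes).
            if (packet.Length < length + 4)
            {
                return -1;
            }
            int words = (packet[length + 2] << 8) | packet[length + 3];
            length += 4 + 4 * words;
        }

        return length <= packet.Length ? length : -1;
    }
}
```

The returned value can be used as RTPHeaderBytes. Padding (the P bit) is not handled here because it only affects the end of the packet, not the offset of the payload.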
Here's the table of NAL unit types:
Type   Name
0      [unspecified]
1      Coded slice (non-IDR picture)
2      Data Partition A
3      Data Partition B
4      Data Partition C
5      IDR (Instantaneous Decoding Refresh) Picture
6      SEI (Supplemental Enhancement Information)
7      SPS (Sequence Parameter Set)
8      PPS (Picture Parameter Set)
9      Access Unit Delimiter
10     EoS (End of Sequence)
11     EoS (End of Stream)
12     Filler Data
13-23  [extended/reserved]
24-31  [unspecified in H.264; RFC 3984 uses 24-29 for STAP-A, STAP-B, MTAP16, MTAP24, FU-A and FU-B]
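To relate the table to the bytes on the wire: the type is the low five bits of the first payload byte, and the remaining three bits are the forbidden_zero_bit and nal_ref_idc fields. Here is a small sketch of splitting that byte into its fields; the class and method names are illustrative only.

```csharp
public static class NalHeaderSketch
{
    // forbidden_zero_bit: must be 0 in a valid stream.
    public static int ForbiddenBit(byte header)
    {
        return (header >> 7) & 0x01;
    }

    // nal_ref_idc: 0 means the unit is not used for reference.
    public static int RefIdc(byte header)
    {
        return (header >> 5) & 0x03;
    }

    // nal_unit_type: the value looked up in the table of NAL unit types.
    public static int UnitType(byte header)
    {
        return header & 0x1F;
    }
}
```

For example, 0x65 decodes to type 5 with nal_ref_idc 3 (an IDR slice), and 0x7C decodes to type 28 with nal_ref_idc 3 (an FU-A fragment in RTP).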
As far as I know, MPEG4-ES stream fragments in RTP payload usually start with MPEG4 startcode, which can be one of these:
0x000001b0: visual_object_sequence_start_code (probably keyframe)
0x000001b6: vop_start_code (keyframe, if the next two bits are zero)
0x000001b3: group_of_vop_start_code, which contains three bytes and then hopefully a vop_start_code that may or may not belong to a keyframe (see above)
0x00000120: video_object_layer_start_code (probably keyframe)
I'm afraid that you'll need to parse the stream to be sure :-/
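The parsing needed for this can be fairly small. The following is a rough sketch that scans a payload for vop_start_code and reads the two vop_coding_type bits that follow it. It assumes the VOP header starts inside the buffer that is passed in (when a VOP is split across several RTP packets, only the first one carries the header), and the class and method names are invented for the example.

```csharp
public static class Mpeg4KeyframeSketch
{
    // Returns true if the buffer contains a vop_start_code (00 00 01 B6)
    // whose vop_coding_type is 0 (I-VOP).
    public static bool ContainsIVop(byte[] payload)
    {
        if (payload == null)
        {
            return false;
        }
        // Need the 4 start-code bytes plus 1 byte holding the coding type.
        for (int i = 0; i + 4 < payload.Length; i++)
        {
            if (payload[i] == 0x00 && payload[i + 1] == 0x00 &&
                payload[i + 2] == 0x01 && payload[i + 3] == 0xB6)
            {
                // Top two bits after the start code: 0 = I, 1 = P, 2 = B, 3 = S.
                int codingType = (payload[i + 4] >> 6) & 0x03;
                if (codingType == 0)
                {
                    return true;
                }
            }
        }
        return false;
    }
}
```

The scan skips over any sequence, layer or group-of-VOP headers that precede the VOP, so a packet that starts with 0x000001b0 or 0x000001b3 is still classified by the VOP that follows.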
Actually, you were correct for the H.264 stream: if the first byte of the payload (the NAL header) is 0x7C, the packet is an FU-A fragment (type 28 with nal_ref_idc 3). In my stream only the I-frames get fragmented, so with packetization-mode=1 in the SDP, reading 0x7C as the first byte means it is an I-frame. Strictly speaking, RFC 3984 lets P- and B-frames be fragmented too, so in general the NAL type in the second byte (the FU header) has to be checked as well. Read more here: http://www.rfc-editor.org/rfc/rfc3984.txt.
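Per RFC 3984, the first byte on its own identifies the packet type (0x7C is an FU-A fragment), so a more general check would read the FU header too and also look inside STAP-A aggregation packets, which are allowed with packetization-mode=1 as well. This is a sketch under those assumptions, not a complete depacketizer, and the names are hypothetical.

```csharp
public static class H264IdrSketch
{
    // Returns true if the RTP payload (RTP header already removed) is an IDR
    // slice, the first FU-A fragment of one, or a STAP-A that contains one.
    public static bool ContainsIdrStart(byte[] payload)
    {
        if (payload == null || payload.Length < 1)
        {
            return false;
        }
        int type = payload[0] & 0x1F;
        if (type == 5)
        {
            return true; // single NAL unit packet carrying an IDR slice
        }
        if (type == 28) // FU-A: second byte is the FU header (S, E, R, type)
        {
            if (payload.Length < 2)
            {
                return false;
            }
            bool start = (payload[1] & 0x80) != 0;
            return start && (payload[1] & 0x1F) == 5;
        }
        if (type == 24) // STAP-A: repeated (16-bit size, NAL unit) entries
        {
            int pos = 1;
            while (pos + 2 < payload.Length)
            {
                int size = (payload[pos] << 8) | payload[pos + 1];
                if (size == 0)
                {
                    break; // malformed entry, stop scanning
                }
                if ((payload[pos + 2] & 0x1F) == 5)
                {
                    return true;
                }
                pos += 2 + size;
            }
        }
        return false;
    }
}
```

With this shape, a fragmented P-frame (for example first bytes 0x5C 0x81) is rejected because the FU header carries type 1, while 0x7C 0x85 is accepted.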