Look at the definitions for RTP and RTCP packets in RFC 3550:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X| CC |M| PT | sequence number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| synchronization source (SSRC) identifier |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| contributing source (CSRC) identifiers |
| .... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
I won't reproduce the legend for all of the above - it's quite long - but take a look at Section 5.1.
With that in hand you'll see there's not a lot you can do to determine if a packet contains RTP/RTCP. Best of all would be to sniff, as other posters have suggested, the media stream negotiation. Second best would be some sort've pattern matching over a sequence of packets: the first two bits will be 10, followed by the next two bits being constant, followed by bits 9 through 15 being constant, then 16 -> 31 incrementing, and so on.