I'd like to screen some jpegs for validity before I send them across the network for more extensive inspection. It is easy enough to check for a valid header and footer, but what is the smallest size (in bytes) a valid jpeg could be?
Why do you need the size to test for JPEG validity? Why not just read the JPEG header to see if it is OK?
Here's the C++ routine I wrote to do this:
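#include <cstddef>  // size_t
#include <cstring>  // memcmp

bool is_jpeg(const unsigned char* img_data, size_t size)
{
    // SOI marker (FF D8) first, then "JFIF" or "Exif" at offset 6.
    return img_data &&
           (size >= 10) &&
           (img_data[0] == 0xFF) &&
           (img_data[1] == 0xD8) &&
           ((memcmp(img_data + 6, "JFIF", 4) == 0) ||
            (memcmp(img_data + 6, "Exif", 4) == 0));
}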
img_data points to a buffer containing the JPEG data.
I'm sure you need more bytes to have a JPEG that will decode to a useful image, but it's a fair bet that if the first 10 bytes pass this test, the buffer probably contains a JPEG.
EDIT: You can, of course, replace the 10 above with a higher value once you decide on one. 134, as suggested in another answer, for example.
The smallest I could create in MS Paint was a 1x1 JPEG, which came to 631 bytes.
Hope this helps, Best regards, Tom.
The first two bytes should tell you if it's a JPG; 0xFFD8.
And as far as I know, the SOF (start of frame) segment contains the image dimensions.
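To illustrate the SOF point, here is a rough sketch of walking the marker segments until a start-of-frame segment turns up and reading the height and width from it. The names jpeg_dimensions, JpegSize and is_sof are made up for this example, and it assumes the whole file is already in memory. It only reads the header fields; it does not prove the image will decode.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type for this sketch.
struct JpegSize { unsigned width; unsigned height; };

// SOF markers are C0..CF, except C4 (DHT), C8 (JPG) and CC (DAC).
static bool is_sof(std::uint8_t m)
{
    return m >= 0xC0 && m <= 0xCF && m != 0xC4 && m != 0xC8 && m != 0xCC;
}

// Returns the dimensions from the first SOF segment, or nullopt if the
// buffer does not look like a JPEG or no SOF is found before the scan data.
std::optional<JpegSize> jpeg_dimensions(const unsigned char* p, std::size_t n)
{
    if (!p || n < 4 || p[0] != 0xFF || p[1] != 0xD8) return std::nullopt;

    std::size_t i = 2;  // first byte after SOI
    while (i + 4 <= n) {
        if (p[i] != 0xFF) return std::nullopt;         // not at a marker
        const std::uint8_t m = p[i + 1];
        if (m == 0xFF) { ++i; continue; }              // fill byte before a marker
        if (m == 0xD9 || m == 0xDA) return std::nullopt;  // EOI or SOS before any SOF
        if (m == 0x01 || (m >= 0xD0 && m <= 0xD7)) {   // markers without a payload
            i += 2;
            continue;
        }
        // The big-endian length field counts itself but not the marker bytes.
        const std::size_t len = (static_cast<std::size_t>(p[i + 2]) << 8) | p[i + 3];
        if (len < 2 || i + 2 + len > n) return std::nullopt;
        if (is_sof(m)) {
            if (len < 7) return std::nullopt;
            // After the length: precision (1 byte), height (2), width (2).
            const unsigned h = (static_cast<unsigned>(p[i + 5]) << 8) | p[i + 6];
            const unsigned w = (static_cast<unsigned>(p[i + 7]) << 8) | p[i + 8];
            return JpegSize{w, h};
        }
        i += 2 + len;  // skip to the next segment
    }
    return std::nullopt;
}
```

A caller could reject buffers for which this returns nullopt, or whose dimensions are zero, before sending them on for the more expensive inspection.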
A 1x1 grey pixel in 125 bytes using arithmetic coding, which is still in the JPEG standard even if most decoders can't decode it:
ff d8 ; SOI
ff e0 ; APP0
00 10
4a 46 49 46 00 01 01 01 00 48 00 48 00 00
ff db ; DQT
00 43
00
03 02 02 02 02 02 03 02
02 02 03 03 03 03 04 06
04 04 04 04 04 08 06 06
05 06 09 08 0a 0a 09 08
09 09 0a 0c 0f 0c 0a 0b
0e 0b 09 09 0d 11 0d 0e
0f 10 10 11 10 0a 0c 12
13 12 10 13 0f 10 10 10
ff c9 ; SOF
00 0b
08 00 01 00 01 01 01 11 00
ff cc ; DAC
00 06 00 10 10 05
ff da ; SOS
00 08
01 01 00 00 3f 00 d2 cf 20
ff d9 ; EOI
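As a sanity check on the 125-byte figure, the segment sizes in the dump above can be added up. This is only a compile-time tally written for illustration (segment and kTotal are made-up names); each segment is counted as its two marker bytes plus the value of its length field.

```cpp
#include <cstddef>

// A marker segment occupies 2 marker bytes plus whatever its length field says
// (the length field counts itself but not the marker).
constexpr std::size_t segment(std::size_t length_field) { return 2 + length_field; }

constexpr std::size_t kTotal =
    2 +              // SOI  (ff d8), no payload
    segment(0x10) +  // APP0 (JFIF header)
    segment(0x43) +  // DQT  (one 64-entry quantization table)
    segment(0x0B) +  // SOF  (ff c9)
    segment(0x06) +  // DAC
    segment(0x08) +  // SOS header
    3 +              // entropy-coded data: d2 cf 20
    2;               // EOI  (ff d9), no payload

static_assert(kTotal == 125, "segment sizes should add up to the stated file size");
```

Most of the total is the quantization table (69 bytes) and the JFIF APP0 header (18 bytes).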
I don't think the 134-byte example mentioned in another answer is standard, as it is missing an EOI marker. Decoders will generally handle this, but the standard says the file should end with one.
It is not a requirement that JPEGs contain either a JFIF or Exif marker. But they must start with FF D8, and they must have a marker following that, so you can check for FF D8 FF.
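A minimal sketch of that check, for what it's worth. The function name looks_like_jpeg and the require_eoi switch are my own invention; the switch exists because, as noted above, some files in the wild are missing the trailing EOI. This is a cheap pre-screen only and says nothing about whether the data in between will decode.

```cpp
#include <cstddef>

// Hypothetical pre-screen: the buffer starts with FF D8 FF (SOI followed by
// the first byte of another marker) and, optionally, ends with FF D9 (EOI).
bool looks_like_jpeg(const unsigned char* data, std::size_t size,
                     bool require_eoi = true)
{
    if (!data || size < 4) return false;

    const bool starts_ok = data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF;
    const bool ends_ok   = data[size - 2] == 0xFF && data[size - 1] == 0xD9;

    return starts_ok && (ends_ok || !require_eoi);
}
```

Unlike a test for "JFIF" or "Exif" at offset 6, this does not reject JPEGs that carry neither marker.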
While I realize this is far from the smallest valid JPEG and has little or nothing to do with your actual question, I felt I should share it, as I'd been looking for a very small JPEG that actually looked like something, to do some testing with, when I found your question. I'm sharing it here because it's valid, it's small, and it makes me ROFL.
Here is a 384-byte JPEG image that I made in Photoshop. It is the letters ROFL, hand drawn by me and then saved with maximum compression settings while still being sort of readable.
Hex sequences:
# This is a very tiny JPEG. It is an image representation of the letters "ROFL", hand drawn by me in Photoshop and then saved at the lowest possible quality settings where the letters could still be made out :)
my @image_hex = qw{
    FF D8
    FF E0 00 10 4A 46 49 46 00 01 02 00 00 64 00 64 00 00
    FF EC 00 11 44 75 63 6B 79 00 01 00 04 00 00 00 00 00 00
    FF EE 00 0E 41 64 6F 62 65 00 64 C0 00 00 00 01
    FF DB 00 84
    00 1B 1A 1A 29 1D 29 41 26 26 41 42 2F 2F 2F 42 47
    3F 3E 3E 3F 47 47 47 47 47 47 47 47 47 47 47 47
    47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
    47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
    01 1D 29 29 34 26 34 3F 28 28 3F 47 3F 35 3F 47 47
    47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
    47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
    47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47
    FF C0 00 11 08 00 08 00 19 03 01 22 00 02 11 01 03 11 01
    FF C4 00 61
    00 01 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 04 02 05
    01 01 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 04
    10 00 02 02 02 02 03 01 00 00 00 00 00 00 00 00 00 01 02 11 03 00 41 21 12 F0 13 04 31
    11 00 01 04 03 00 00 00 00 00 00 00 00 00 00 00 00 00 21 31 61 71 B1 12 22
    FF DA 00 0C 03 01 00 02 11 03 11 00 3F 00
    A1 7E 6B AD 4E B6 4B 30 EA E0 19 82 39 91 3A 6E
    63 5F 99 8A 68 B6 E3 EA 70 08 A8 00 55 98 EE 48
    22 37 1C 63 19 AF A5 68 B8 05 24 9A 7E 99 F5 B3
    22 20 55 EA 27 CD 8C EB 4E 31 91 9D 41
    FF D9
};
use URI::Escape;  # provides uri_escape
my $image_data = pack('H2' x scalar(@image_hex), @image_hex);
my $url_escaped_image = uri_escape($image_data);
URL-escaped binary image data (can be pasted right into a URL):
%FF%D8%FF%E0%00%10JFIF%00%01%02%00%00d%00d%00%00%FF%EC%00%11Ducky%00%01%00%04%00%00%00%00%00%00%FF%EE%00%0EAdobe%00d%C0%00%00%00%01%FF%DB%00%84%00%1B%1A%1A)%1D)A%26%26AB%2F%2F%2FBG%3F%3E%3E%3FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%01%1D))4%264%3F((%3FG%3F5%3FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG%FF%C0%00%11%08%00%08%00%19%03%01%22%00%02%11%01%03%11%01%FF%C4%00a%00%01%01%01%01%00%00%00%00%00%00%00%00%00%00%00%00%00%04%02%05%01%01%01%01%00%00%00%00%00%00%00%00%00%00%00%00%00%00%02%04%10%00%02%02%02%02%03%01%00%00%00%00%00%00%00%00%00%01%02%11%03%00A!%12%F0%13%041%11%00%01%04%03%00%00%00%00%00%00%00%00%00%00%00%00%00!1aq%B1%12%22%FF%DA%00%0C%03%01%00%02%11%03%11%00%3F%00%A1~k%ADN%B6K0%EA%E0%19%829%91%3Anc_%99%8Ah%B6%E3%EAp%08%A8%00U%98%EEH%227%1Cc%19%AF%A5h%B8%05%24%9A~%99%F5%B3%22%20U%EA'%CD%8C%EBN1%91%9DA%FF%D9