I want to create my own time stamp data structure in C.
DAY ( 0 - 31 ), HOUR ( 0 - 23 ), MINUTE ( 0 - 59 )
What is the smallest data structure possible?
I want to create my own time stamp data structure in C.
DAY ( 0 - 31 ), HOUR ( 0 - 23 ), MINUTE ( 0 - 59 )
What is the smallest data structure possible?
Well, you could pack it all in an unsigned short
(That's 2 bytes, 5 bits for Day, 5 bits for hour, 6 bits for minute)... and use some shifts and masking to get the values.
unsigned short timestamp = <some value>; // Bits: DDDDDHHHHHMMMMMM
int day = (timestamp >> 11) & 0x1F;
int hour = (timestamp >> 6) & 0x1F;
int min = (timestamp) & 0x3F;
unsigned short dup_timestamp = (short)((day << 11) | (hour << 6) | min);
or using macros
#define DAY(x) (((x) >> 11) & 0x1F)
#define HOUR(x) (((x) >> 6) & 0x1F)
#define MINUTE(x) ((x) & 0x3F)
#define TIMESTAMP(d, h, m) ((((d) & 0x1F) << 11) | (((h) & 0x1F) << 6) | ((m) & 0x3F)
(You didn't mention month/year in your current version of the question, so I've omitted them).
[Edit: use unsigned short
- not signed short
.]
You can use a bitfield and use a compiler/platform specific pragma to keep it tight:
typedef struct packed_time_t {
unsigned int month : 4;
unsigned int date : 5;
unsigned int hour : 5;
unsigned int minute : 6;
} packed_time_t;
But do you really need this? Wouldn't the standard time functions be enough? Bitfields vary depending on architecture, padding and so on ... not a portable construct.
For the use case you describe, (minute resolution of times in a 31 day range) I'd just use a 16-bit minute counter. If you're serializing this data (to disk, network) then you can use some variable length integer encoding to save bytes for small values.
Do you mean HOUR 0-23 and MINUTE 0-59? I've heard of leap seconds but not leap minutes or hours.
(log (* 31 60 24) 2)
=> 15.446
So you can fit these values 16 bits, or 2 bytes. Whether this is a good idea or not is a completely different question.
5 bits for the day plus 5 bits for the hour plus 6 bits for the minute equals an unsigned short. Any further packing would not reduce the storage space required and would increase code complexity and cpu usage.
In general you can compute this answer as follows (where log2 is the base 2 logarithm, i.e. the number of bits):
If you want to use shifts and masks to get the data in and out, take log2() of the number of possible values for each field, round up (to get bits), add the results (to get total bits), divide by eight (total bytes, w. fractional bytes), and round up again (total bytes).
log2(60) + log2(60) + log2(24) + log2(31) + log2(12) = 6+6+5+5+4 = 26 bits = 4 bytes
If you want to get the fields in and out by multiplying & adding / dividing & modulo, multiply together the number of possible values for each field and take log2() of that, divide by eigth, and round up.
log2(60*60*24*31*12) = 24.9379 bits = 4 bytes
You can save a tiny additional amount of space by combining non-isoformal fields (e.g. storing day of year rather than month and day of month) but it is seldom worth it.
log2(60*60*24*366) = 24.91444 bits = 4 bytes
-- MarkusQ "teach a man to fish"
Note: The original question has been edited, and the month is no longer necessary. The original calculations were below:
It's simply a matter of how much computation you want to do. The tightest way to pack it is if you can make your own type, and use the following math to convert from and to its corresponding integer:
Valid ranges are:
Month: 1-12 -> (0-11)+1 Day: 1-31 -> (0-30)+1 Hour: 0-24 Minute: 0-60
You can choose an order to store the values in (I'll keep it in the above order).
Month-1 Day-1 Hour Minute (0-11) (0-30) (0-23) (0-59)Do a bit of multiplication/division to convert the values using the following formula as a guide:
value = (((Month - 1) * 31 + (Day - 1)) * 24 + Hour) * 60 + Minute
So, you have the minimum value 0 and the maximum value ((11*31+30)*24+23)*60+59
, which is 535,679. So you need 20 bits minimum to store this value as an unsigned integer (2^20-1 = 1,048,575; 2^19-1 = 524,287
).
If you want to make things dificult but save a byte, you can use 3 bytes and manipulate them yourself. Or you can use an int (32-bit) and work with it normally using simple math operators.
BUT There's some room to play with there though, so let's see if we can make this easier:
Valid ranges are, again:
Month: 1-12 -> (0-11)+1 --- 4 bits (you don't even need the -1) Day: 1-31 -> (0-30)+1 --- 5 bits (you again don't need the -1) Hour: 0-24 --- 5 bits Minute: 0-60 --- 6 bits
That's a total of 20 bits, and really easy to manipulate. So you don't gain anything by compacting any further than using simple bit-shifting, and you can store the value like this:
19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 ---Month--- ----Day------- ---Hour--- --Minute---
If you don't care about the month, the tightest you can get is:
value = ((Day - 1) * 24 + Hour) * 60 + Minute
leaving you with a range of 0 to 44,639 which can fit neatly in a 16-bit short
.
There's some room to play with there though, so let's see if we can make this easier:
Valid ranges are, again:
Day: 1-31 -> (0-30)+1 --- 5 bits (you don't even need the -1) Hour: 0-24 --- 5 bits Minute: 0-60 --- 6 bits
That's a total of 16 bits, and again really easy to manipulate. So....store the value like this:
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 ----Day------- ---Hour--- --Minute---
just to offer an alternative:
then you can store the timestamp as an offset from the timestamp of the last message.
In this case, you only need enough bits to hold the maximum number of minutes between messages. For example, if you emit messages at most 255 minutes apart, then one byte will suffice.
Note, however, that the very first message may need to include an absolute timestamp in its payload, for synchronization.
[i'm not saying this is a good solution - it's fairly fragile and makes a lot of assumptions - just an alternative one]
Well, disregarding the superfluous HOUR 24 and MINUTE 60, we have 31 x 24 x 60 = 44,640 possible unique time values. 2^15 = 32,768 < 44,640 < 65,536 = 2^16 so we'll need at least 16 bits (2 bytes) to represent these values.
If we don't want to be doing modulo arithmetic to access the values each time, we need to be sure to store each in its own bit field. We need 5 bits to store the DAY, 5 bits to store the HOUR, and 6 bits to store the MINUTE, which still fits in 2 bytes:
struct day_hour_minute {
unsigned char DAY:5;
unsigned char HOUR:5;
unsigned char MINUTE:6;
};
Including the MONTH would increase our unique time values by a factor of 12, giving 535,680 unique values, which would require at least 20 bits to store (2^19 = 524,288 < 535,680 < 1,048,576 = 2^20), which requires at least 3 bytes.
Again, to avoid modulo arithmetic, we need a separate bit field for MONTH, which should only require 4 bits:
struct month_day_hour_minute {
unsigned char MONTH:4;
unsigned char DAY:5;
unsigned char HOUR:5;
unsigned char MINUTE:6;
unsigned char unused: 4;
};
In both of these examples however, be aware that C prefers its data structures be on-cut - that is, that they are multiples of 4 or 8 bytes (usually), so it may pad your data structures beyond what is minimally necessary.
For example, on my machine,
#include <stdio.h>
struct day_hour_minute {
unsigned int DAY:5;
unsigned int HOUR:5;
unsigned int MINUTE:6;
};
struct month_day_hour_minute {
unsigned int MONTH:4;
unsigned int DAY:5;
unsigned int HOUR:5;
unsigned int MINUTE:6;
unsigned int unused: 4;
};
#define DI( i ) printf( #i " = %d\n", i )
int main(void) {
DI( sizeof(struct day_hour_minute) );
DI( sizeof(struct month_day_hour_minute) );
return 0;
}
prints:
sizeof(struct day_hour_minute) = 4
sizeof(struct month_day_hour_minute) = 4
60 Minutes/Hour means you'd need at least 6 bits to store the minute (since 59th minute == 111011b), while 24 Hours/Day means another 5 bits (23rd hour == 10111b). If you want to account for any of the (possibly) 366 Days/Year, you'd need 9 more bits (366th day (365 when day 1 == 0) == 101101101b). So if you wanted to store everything in a purely accessible format, you'd need 20 bits == 3 Bytes. Alternatively, adding a Month field would make the total possible Days value go from 366 to 31 -- down to 5 bits, with 4 more bits for the month. This would also give you 20 bits, or 3 bytes with 4 bits to spare.
Conversely, if you kept track of the date just by minutes from some start date, 3 bytes would give you a resolution of 16,777,215 minutes before you rolled over to 0 again -- that's about 279,620 hours, 11,650 days, and about 388 months, and that's using all 24 bits. That's probably a better way to go, if you don't care about seconds, and if you don't mind taking a little bit of execution time to interpret the hour, day and month. And this would be much easier to increment!
Why not just use the (4-byte?) output of the C time()
function with NULL
as an argument. It's just the Unix epoch time (i.e. the number of seconds since January 1st, 1970). Like Joe's answer, it gives you much more room to grow than any answer that tries to pack in months and days and years into bits. It's standard. Converting the time_t
variable to an actual time is trivial in standard C (on Unix, at least) and most of the time, if you have a data structure intended to hold a 3 byte variable, it may be rounded up to 4 bytes anyway.
I know you're trying to optimize heavily for size, but 4 bytes is pretty damn small. Even if you truncate off the top byte, you still get 194 days of distinct times out of it.
You can get even more out of this by taking the time from time(NULL)
and dividing it by 60 before storing it, truncating it to a minute and storing that. 3 bytes of that gives you, as shown above, 388 months, and for 2 bytes you can store 45 days.
I would go with the 4-byte version, simply because I don't see the difference between 2, 3 and 4 bytes as being at all significant or vital to any program running or not (unless it's a bootloader). It's simpler to get and simpler to handle, and will probably save you many headaches in the end.
EDIT: The code I posted didn't work. I've had 3 hours of sleep and I'll figure out how to do the bit-twiddling correctly eventually. Until then, you can implement this yourself.
To simplify this without loss of generality,
Day (0 - 30), Hour (0 - 23), Minute (0 - 59)
encoding = Day + (Hour + (Minute)*24)*31
Day = encoding %31
Hour = (encoding / 31) % 24
Minute = (encoding / 31) / 24
The maximum value of encoding is 44639 which is slightly less than 16 bits.
Edit: rampion said basically the same thing. And this gets you the minimal representation, which is less than the bitwise interleaving representation.