tags:

views:

538

answers:

12

I want to create my own time stamp data structure in C.

DAY ( 0 - 31 ), HOUR ( 0 - 23 ), MINUTE ( 0 - 59 )

What is the smallest data structure possible?

+5  A: 

Well, you could pack it all in an unsigned short (That's 2 bytes, 5 bits for Day, 5 bits for hour, 6 bits for minute)... and use some shifts and masking to get the values.

unsigned short timestamp = <some value>; // Bits: DDDDDHHHHHMMMMMM

int day = (timestamp >> 11) & 0x1F;
int hour = (timestamp >> 6) & 0x1F;
int min = (timestamp) & 0x3F;

unsigned short dup_timestamp = (short)((day << 11) | (hour << 6) | min);

or using macros

#define DAY(x)    (((x) >> 11) & 0x1F)
#define HOUR(x)   (((x) >> 6)  & 0x1F)
#define MINUTE(x) ((x)         & 0x3F)
#define TIMESTAMP(d, h, m) ((((d) & 0x1F) << 11) | (((h) & 0x1F) << 6) | ((m) & 0x3F)

(You didn't mention month/year in your current version of the question, so I've omitted them).

[Edit: use unsigned short - not signed short.]

Daniel LeCheminant
Use bitfields instead of shift and mask
mpez0
@mpez0: only if you like slow, bloated code.
Jonathan Leffler
@mpez0: shifts and masks are portable; for bitfields, the size of the struct which contains the bitfield is implementation-dependent, as is the order of the bits. @Jonathan: depends on the platform; for some embedded processors and compilers, the compiled code is the same.
Steve Melnikoff
@Jonathan Leffler: Thanks for fixing that ;]
Daniel LeCheminant
+2  A: 
  • Month: range 1 - 12 => 4 bits
  • Date: range 1 - 31 => 5 bits
  • Hour: range 0 - 24 => 5 bits
  • Minute: range 0 - 60 => 6 bits


  • Total: 20 bits

You can use a bitfield and use a compiler/platform specific pragma to keep it tight:

typedef struct packed_time_t {
    unsigned int month  : 4;
    unsigned int date   : 5;
    unsigned int hour   : 5;
    unsigned int minute : 6;
} packed_time_t;

But do you really need this? Wouldn't the standard time functions be enough? Bitfields vary depending on architecture, padding and so on ... not a portable construct.

dirkgently
This timestamp has to be stored in a message, which has to be as small as possible.
Justin Tanner
+2  A: 

For the use case you describe, (minute resolution of times in a 31 day range) I'd just use a 16-bit minute counter. If you're serializing this data (to disk, network) then you can use some variable length integer encoding to save bytes for small values.

slacy
+4  A: 

Do you mean HOUR 0-23 and MINUTE 0-59? I've heard of leap seconds but not leap minutes or hours.

(log (* 31 60 24) 2)
=> 15.446

So you can fit these values 16 bits, or 2 bytes. Whether this is a good idea or not is a completely different question.

Ken
A: 

5 bits for the day plus 5 bits for the hour plus 6 bits for the minute equals an unsigned short. Any further packing would not reduce the storage space required and would increase code complexity and cpu usage.

Les
+1  A: 

In general you can compute this answer as follows (where log2 is the base 2 logarithm, i.e. the number of bits):

  • If you want to use shifts and masks to get the data in and out, take log2() of the number of possible values for each field, round up (to get bits), add the results (to get total bits), divide by eight (total bytes, w. fractional bytes), and round up again (total bytes).

    log2(60) + log2(60) + log2(24) + log2(31) + log2(12) = 6+6+5+5+4 = 26 bits = 4 bytes

  • If you want to get the fields in and out by multiplying & adding / dividing & modulo, multiply together the number of possible values for each field and take log2() of that, divide by eigth, and round up.

    log2(60*60*24*31*12) = 24.9379 bits = 4 bytes

  • You can save a tiny additional amount of space by combining non-isoformal fields (e.g. storing day of year rather than month and day of month) but it is seldom worth it.

    log2(60*60*24*366) = 24.91444 bits = 4 bytes

-- MarkusQ "teach a man to fish"

MarkusQ
A: 

Note: The original question has been edited, and the month is no longer necessary. The original calculations were below:

It's simply a matter of how much computation you want to do. The tightest way to pack it is if you can make your own type, and use the following math to convert from and to its corresponding integer:

Valid ranges are:

Month: 1-12 -> (0-11)+1
Day: 1-31 -> (0-30)+1
Hour: 0-24
Minute: 0-60

You can choose an order to store the values in (I'll keep it in the above order).

Month-1 Day-1  Hour   Minute
(0-11)  (0-30) (0-23) (0-59)
Do a bit of multiplication/division to convert the values using the following formula as a guide:
value = (((Month - 1) * 31 + (Day - 1)) * 24 + Hour) * 60 + Minute

So, you have the minimum value 0 and the maximum value ((11*31+30)*24+23)*60+59, which is 535,679. So you need 20 bits minimum to store this value as an unsigned integer (2^20-1 = 1,048,575; 2^19-1 = 524,287).

If you want to make things dificult but save a byte, you can use 3 bytes and manipulate them yourself. Or you can use an int (32-bit) and work with it normally using simple math operators.

BUT There's some room to play with there though, so let's see if we can make this easier:

Valid ranges are, again:

Month: 1-12 -> (0-11)+1 --- 4 bits (you don't even need the -1)
Day: 1-31 -> (0-30)+1   --- 5 bits (you again don't need the -1) 
Hour: 0-24              --- 5 bits
Minute: 0-60            --- 6 bits

That's a total of 20 bits, and really easy to manipulate. So you don't gain anything by compacting any further than using simple bit-shifting, and you can store the value like this:

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---Month--- ----Day------- ---Hour--- --Minute---


If you don't care about the month, the tightest you can get is:

value = ((Day - 1) * 24 + Hour) * 60 + Minute

leaving you with a range of 0 to 44,639 which can fit neatly in a 16-bit short.

There's some room to play with there though, so let's see if we can make this easier:

Valid ranges are, again:

Day: 1-31 -> (0-30)+1 --- 5 bits (you don't even need the -1) 
Hour: 0-24            --- 5 bits
Minute: 0-60          --- 6 bits

That's a total of 16 bits, and again really easy to manipulate. So....store the value like this:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
----Day------- ---Hour--- --Minute---
lc
Yes, yes I do. Thanks.
lc
You have an extra bit on the "month" field. Only 16 to 19 bits is needed.
Casey
+1  A: 

just to offer an alternative:

  • if you only need minute-level resolution,
  • and you don't cross date boundaries (month/year)
  • and your messages are sequential with guaranteed delivery

then you can store the timestamp as an offset from the timestamp of the last message.

In this case, you only need enough bits to hold the maximum number of minutes between messages. For example, if you emit messages at most 255 minutes apart, then one byte will suffice.

Note, however, that the very first message may need to include an absolute timestamp in its payload, for synchronization.

[i'm not saying this is a good solution - it's fairly fragile and makes a lot of assumptions - just an alternative one]

Steven A. Lowe
A: 

Well, disregarding the superfluous HOUR 24 and MINUTE 60, we have 31 x 24 x 60 = 44,640 possible unique time values. 2^15 = 32,768 < 44,640 < 65,536 = 2^16 so we'll need at least 16 bits (2 bytes) to represent these values.

If we don't want to be doing modulo arithmetic to access the values each time, we need to be sure to store each in its own bit field. We need 5 bits to store the DAY, 5 bits to store the HOUR, and 6 bits to store the MINUTE, which still fits in 2 bytes:

struct day_hour_minute {
  unsigned char DAY:5; 
  unsigned char HOUR:5;
  unsigned char MINUTE:6;
};

Including the MONTH would increase our unique time values by a factor of 12, giving 535,680 unique values, which would require at least 20 bits to store (2^19 = 524,288 < 535,680 < 1,048,576 = 2^20), which requires at least 3 bytes.

Again, to avoid modulo arithmetic, we need a separate bit field for MONTH, which should only require 4 bits:

struct month_day_hour_minute {
  unsigned char MONTH:4;
  unsigned char DAY:5;
  unsigned char HOUR:5;
  unsigned char MINUTE:6;
  unsigned char unused: 4;
};

In both of these examples however, be aware that C prefers its data structures be on-cut - that is, that they are multiples of 4 or 8 bytes (usually), so it may pad your data structures beyond what is minimally necessary.

For example, on my machine,

#include <stdio.h>

struct day_hour_minute {
  unsigned int DAY:5;
  unsigned int HOUR:5;
  unsigned int MINUTE:6;
};
struct month_day_hour_minute {
  unsigned int MONTH:4;
  unsigned int DAY:5;
  unsigned int HOUR:5;
  unsigned int MINUTE:6;
  unsigned int unused: 4;
};

#define DI( i ) printf( #i " = %d\n", i )
int main(void) {
  DI( sizeof(struct day_hour_minute) );
  DI( sizeof(struct month_day_hour_minute) );
  return 0;
}

prints:

sizeof(struct day_hour_minute) = 4
sizeof(struct month_day_hour_minute) = 4
rampion
+1  A: 

60 Minutes/Hour means you'd need at least 6 bits to store the minute (since 59th minute == 111011b), while 24 Hours/Day means another 5 bits (23rd hour == 10111b). If you want to account for any of the (possibly) 366 Days/Year, you'd need 9 more bits (366th day (365 when day 1 == 0) == 101101101b). So if you wanted to store everything in a purely accessible format, you'd need 20 bits == 3 Bytes. Alternatively, adding a Month field would make the total possible Days value go from 366 to 31 -- down to 5 bits, with 4 more bits for the month. This would also give you 20 bits, or 3 bytes with 4 bits to spare.

Conversely, if you kept track of the date just by minutes from some start date, 3 bytes would give you a resolution of 16,777,215 minutes before you rolled over to 0 again -- that's about 279,620 hours, 11,650 days, and about 388 months, and that's using all 24 bits. That's probably a better way to go, if you don't care about seconds, and if you don't mind taking a little bit of execution time to interpret the hour, day and month. And this would be much easier to increment!

Joe
+2  A: 

Why not just use the (4-byte?) output of the C time() function with NULL as an argument. It's just the Unix epoch time (i.e. the number of seconds since January 1st, 1970). Like Joe's answer, it gives you much more room to grow than any answer that tries to pack in months and days and years into bits. It's standard. Converting the time_t variable to an actual time is trivial in standard C (on Unix, at least) and most of the time, if you have a data structure intended to hold a 3 byte variable, it may be rounded up to 4 bytes anyway.

I know you're trying to optimize heavily for size, but 4 bytes is pretty damn small. Even if you truncate off the top byte, you still get 194 days of distinct times out of it.

You can get even more out of this by taking the time from time(NULL) and dividing it by 60 before storing it, truncating it to a minute and storing that. 3 bytes of that gives you, as shown above, 388 months, and for 2 bytes you can store 45 days.

I would go with the 4-byte version, simply because I don't see the difference between 2, 3 and 4 bytes as being at all significant or vital to any program running or not (unless it's a bootloader). It's simpler to get and simpler to handle, and will probably save you many headaches in the end.

EDIT: The code I posted didn't work. I've had 3 hours of sleep and I'll figure out how to do the bit-twiddling correctly eventually. Until then, you can implement this yourself.

Chris Lutz
A: 

To simplify this without loss of generality,

Day (0 - 30), Hour (0 - 23), Minute (0 - 59)

encoding = Day + (Hour + (Minute)*24)*31

Day = encoding %31
Hour = (encoding / 31) % 24
Minute = (encoding / 31) / 24

The maximum value of encoding is 44639 which is slightly less than 16 bits.

Edit: rampion said basically the same thing. And this gets you the minimal representation, which is less than the bitwise interleaving representation.

MSN
Exactly, but you actually don't have to do that (it doesn't gain you an extra bit). You can get exactly 16 bits using a bit-shifting and don't have to do modular arithmetic. See my answer above.
lc
Oh, I'm not arguing with the bitpacking method; but it's good to also know how to compact a set of numbers into its most compact form. And if this were a question about compressing chess moves, you'd want to use other techniques too :)
MSN