I need to store dates such as 'Summer 1878' or 'Early June 1923', or even 'Mid-afternoon on a Tuesday in August'. How would you suggest I do this?

I have considered breaking the date and time up into separate (integer) columns, and giving each column an ancillary (integer) column containing a range (0 if exact; NULL if unknown). But I'm sure there are other ways...

Thanks!

A: 

I'm using Postgres, and I wanted to do the same thing. Perhaps you can do it the same way as I did it, if MySQL has some similar geometric types: http://www.electricwords.org/2008/11/fuzzy-date-matching-in-postgresql/

Dan Ellis
A: 

Whatever you do, you almost certainly won't be able to get the database to do the heavy lifting for you. That leaves two options: (1) use natural-language strings, as you have described, or (2) store a precise date together with the precision of that date.

For example, you could store "5:10:23pm on Sep 23, 1975" and "plus or minus 6 months", so that when someone searches for records that occurred in that timeframe, this one could pop up.

This doesn't help with queries, because to the best of my knowledge MySQL doesn't provide any support for tolerances (nor does any other database I know of). You basically have to query everything and then filter the results yourself.
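A minimal sketch of that "filter it yourself" step, in Python. The record labels, the 183-day tolerance standing in for "six months", and the helper names are assumptions made up for illustration; nothing here is a MySQL feature.

```python
from datetime import datetime, timedelta

# Hypothetical rows as they might come back from the database:
# (label, best-guess datetime, tolerance around that guess).
records = [
    ("record A", datetime(1975, 9, 23, 17, 10, 23), timedelta(days=183)),
    ("record B", datetime(1923, 6, 5), timedelta(days=5)),
]

def matches(center, tolerance, query_start, query_end):
    """True if [center - tolerance, center + tolerance] overlaps the query window."""
    return center - tolerance <= query_end and center + tolerance >= query_start

def search(rows, query_start, query_end):
    """Filter rows in application code, keeping those whose fuzzy span overlaps the window."""
    return [
        label
        for label, center, tolerance in rows
        if matches(center, tolerance, query_start, query_end)
    ]

# Anything that could have happened in the first quarter of 1975.
hits = search(records, datetime(1975, 1, 1), datetime(1975, 3, 31))
```

Here "record A" qualifies because its six-month tolerance reaches back into late March 1975, even though its best-guess date is in September.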

Chris Arguin
A: 

Create a table with a list of the values you might want, like "Early" or "Summer". Then whatever code sets up the data could use an algorithm that sets a foreign key depending on the date.

DForck42
I think a list of human terms for date ranges will be exceptionally useful for users selecting dates (though I think the underlying stored dates should be hard-and-fast datetimes).
Sam
A: 

Building on Chris Arguin's answer: make the second column another datetime column that stores the +/-. You should then be able to write a query that uses both columns to get an approximate datetime.

James Black
A: 

Use two dates: work out the start date and the end date of the fuzzy region. For something like Summer 1878, enter 18780621 to 18780920. For Early June 1923 you have to decide when "early" ends, maybe 19230601 to 19230610. This makes it possible to select against the values. You might still want to filter afterward, but this will get you close.

For the ones without years, you'll have to find a different system.
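As a rough illustration of turning such descriptions into a start/end pair, here is a Python sketch. The lookup tables and function names are hypothetical, and the cut-off days (summer as 21 June to 20 September, "early" as the 1st to the 10th) simply mirror the examples above; where those boundaries fall is a judgment call.

```python
from datetime import date

# Assumed conventions, not a standard: adjust the boundaries to taste.
SEASONS = {"summer": ((6, 21), (9, 20))}          # (month, day) start and end
MONTH_PARTS = {"early": (1, 10), "mid": (11, 20)}  # first and last day of month

def as_int(d):
    """Encode a date as a YYYYMMDD integer, e.g. 1878-06-21 -> 18780621."""
    return d.year * 10000 + d.month * 100 + d.day

def season_range(season, year):
    """Start and end of a named season in a given year, as YYYYMMDD integers."""
    (start_month, start_day), (end_month, end_day) = SEASONS[season]
    return (as_int(date(year, start_month, start_day)),
            as_int(date(year, end_month, end_day)))

def month_part_range(part, month, year):
    """Start and end of a named part of a month, as YYYYMMDD integers."""
    first_day, last_day = MONTH_PARTS[part]
    return (as_int(date(year, month, first_day)),
            as_int(date(year, month, last_day)))

summer_1878 = season_range("summer", 1878)
early_june_1923 = month_part_range("early", 6, 1923)
```

Because the integers sort the same way the dates do, the two values can go straight into a pair of indexed columns and be compared with ordinary `<=` and `>=`.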

jmucchiello
A: 

Since 'Mid-afternoon on a Tuesday in August' ("A Sunday Afternoon on the Island of La Grande Jatte"?) doesn't specify a year, the only real solution is your table of all date and time components, all nullable.

Otherwise, you're conflating your data.

You have two (admittedly related) things here: a human readable string, the date_description, and a range of possible dates.

If you can specify at least a range, you can do this:

create table artwork (
  artwork_id int not null auto_increment primary key,
  name varchar(80),
  -- ... other columns
  date_description varchar(80),
  earliest_possible_creation_date datetime,
  latest_possible_creation_date datetime
);

insert into artwork (
  name,
  date_description,
  earliest_possible_creation_date,
  latest_possible_creation_date
) values (
  'A Sunday Afternoon on the Island of La Grande Jatte',
  'Mid-afternoon on a Tuesday in August',
  '1884-01-01',
  '1886-12-31'
), (
  'Blonde Woman with Bare Breasts',
  'Summer 1878',
  '1878-05-01',
  '1878-08-31'
), (
  'Paulo on a Donkey',
  'Early June 1923',
  '1923-06-01',
  '1923-06-15'
);

This allows you to display whatever you want, and search for:

select * from artwork 
where @some_date between 
earliest_possible_creation_date and latest_possible_creation_date;

And obviously, "creation date" (the date the artist created the work) is entirely different from "date depicted in work", if the latter can be determined at all.
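To try the range query end to end without a MySQL server, the following Python sketch runs the same idea against an in-memory SQLite database. SQLite is only a stand-in here, and storing the dates as ISO-8601 text (so that string comparison orders them correctly) is an assumption of the sketch; the function name `created_around` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table artwork (
        artwork_id integer primary key,
        name text,
        date_description text,
        earliest_possible_creation_date text,  -- ISO-8601, e.g. '1884-01-01'
        latest_possible_creation_date text
    )
""")
conn.executemany(
    "insert into artwork (name, date_description,"
    " earliest_possible_creation_date, latest_possible_creation_date)"
    " values (?, ?, ?, ?)",
    [
        ("A Sunday Afternoon on the Island of La Grande Jatte",
         "Mid-afternoon on a Tuesday in August", "1884-01-01", "1886-12-31"),
        ("Blonde Woman with Bare Breasts",
         "Summer 1878", "1878-05-01", "1878-08-31"),
        ("Paulo on a Donkey",
         "Early June 1923", "1923-06-01", "1923-06-15"),
    ],
)

def created_around(connection, some_date):
    """Names of works whose possible creation range contains some_date."""
    rows = connection.execute(
        "select name from artwork"
        " where ? between earliest_possible_creation_date"
        " and latest_possible_creation_date"
        " order by artwork_id",
        (some_date,),
    )
    return [row[0] for row in rows]

matches_1885 = created_around(conn, "1885-06-15")
```

The stored `date_description` is never parsed; it is kept purely for display, while the two range columns do all the searching.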

tpdi
I think this start- and end-dates structure is the way to go. It seems the simplest and least presumptive. Would you order by the midpoint between the two dates, do you think? (Your data examples are exceedingly apposite, I must say!)
Sam
A: 

I don't think any native MySQL date representation is going to work for you. Your two-column solution would work well if paired with a Unix time stamp (generated with the UNIX_TIMESTAMP() function with a MySQL date as the argument). Use the second column (the range width) for an upper and lower bound in your selects, and make sure the date column is indexed.
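A sketch of the timestamp-plus-width layout, again using Python with an in-memory SQLite database as a stand-in for MySQL. The table and column names (`event`, `ts`, `range_width`) are made up for illustration. One caveat worth checking against the MySQL documentation: as far as I know, UNIX_TIMESTAMP() does not produce negative values for dates before 1970, so for the nineteenth-century dates in the question this sketch computes signed epoch seconds in application code instead.

```python
import sqlite3
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
DAY = 86400  # seconds in a day

def epoch_seconds(year, month, day):
    """Signed seconds since 1970-01-01 UTC; negative for earlier dates."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int((moment - EPOCH).total_seconds())

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table event (label text, ts integer, range_width integer)"
)
conn.execute("create index event_ts on event (ts)")
conn.executemany(
    "insert into event values (?, ?, ?)",
    [
        # Best guess in the middle of the span, width as +/- seconds.
        ("Summer 1878", epoch_seconds(1878, 8, 5), 46 * DAY),
        ("Early June 1923", epoch_seconds(1923, 6, 5), 5 * DAY),
    ],
)

def find(connection, window_start, window_end):
    """Labels whose [ts - range_width, ts + range_width] overlaps the window."""
    rows = connection.execute(
        "select label from event"
        " where ts - range_width <= ? and ts + range_width >= ?"
        " order by label",
        (window_end, window_start),
    )
    return [row[0] for row in rows]

july_1878 = find(conn, epoch_seconds(1878, 7, 1), epoch_seconds(1878, 7, 31))
```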

Jeremy DeGroot
A: 

In the end I decided upon: a column for each of the date components (year, month, day, hour, minute, second), and accompanying columns for the range of each of these (year_range, month_range, day_range, hour_range, minute_range, second_range), mainly because this method allows me to specify that I know for sure that a particular photo was taken in August (for instance) in the late '60s (year=1868, year_range=2, month=8, month_range=0).
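For what it's worth, here is how that layout might look as a Python sketch, cut down to year, month and day. The class and function names are hypothetical, and two conventions are assumptions rather than anything settled above: a missing (None) component is treated as "unknown, matches anything", and a missing range is treated as 0 (exact).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FuzzyDate:
    """One value column plus one range column per date component."""
    year: Optional[int] = None
    year_range: Optional[int] = None
    month: Optional[int] = None
    month_range: Optional[int] = None
    day: Optional[int] = None
    day_range: Optional[int] = None

def component_matches(value, value_range, candidate):
    """True if candidate lies within value +/- value_range.

    An unknown value (None) matches anything. Note that this simple
    comparison does not wrap around, e.g. from December to January.
    """
    if value is None:
        return True
    return abs(candidate - value) <= (value_range or 0)

def could_be(fuzzy, year, month, day):
    """True if the concrete date is compatible with every known component."""
    return (
        component_matches(fuzzy.year, fuzzy.year_range, year)
        and component_matches(fuzzy.month, fuzzy.month_range, month)
        and component_matches(fuzzy.day, fuzzy.day_range, day)
    )

# The photo from the example: definitely August, year 1868 give or take 2.
photo = FuzzyDate(year=1868, year_range=2, month=8, month_range=0)
in_range = could_be(photo, 1869, 8, 15)
```

The day is left unset for the photo, so any day in August of 1866 through 1870 is accepted.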

Thank you all for your help!

Sam