I have a table of data sorted by date, from which a user can select a set of data by supplying a start and end date. The data itself is non-continuous, in that I don't have data for weekends and public holidays.

I would like to be able to list all the days that I don't have data for in the extracted dataset. Is there an easy way, in Java, to go:

  1. Here is an ordered array of dates.
  2. This is the selected start date. (The first date in the array is not always the start date)
  3. This is the selected end date. (The last date in the array is not always the end date)
  4. Return a list of dates which have no data.
+1  A: 

You should be able to create a filtered iterator that provides this. Perhaps have the method that returns the iterator accept the start and end dates of your sub-collection. As for the actual implementation of the iterator, I can't think of anything much more elegant than a brute-force run through the whole collection once the start element has been found.
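A minimal sketch of what such a filtered iterator might look like, assuming the dates are java.time.LocalDate values sorted in ascending order and that both the start and end dates are inclusive. The class name MissingDateIterator is hypothetical, not something from the question.

```java
import java.time.LocalDate;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical iterator over the days in [start, end] that are absent from a sorted list. */
final class MissingDateIterator implements Iterator<LocalDate> {
    private final List<LocalDate> sorted; // ascending order is assumed
    private final LocalDate end;          // inclusive upper bound (an assumption)
    private LocalDate candidate;          // next calendar day to examine
    private int index = 0;                // current position in the sorted data

    MissingDateIterator(List<LocalDate> sorted, LocalDate start, LocalDate end) {
        this.sorted = sorted;
        this.end = end;
        this.candidate = start;
        advance();
    }

    /** Moves candidate forward until it is a day with no data, or past end. */
    private void advance() {
        while (!candidate.isAfter(end)) {
            // Skip data entries that lie before the candidate day.
            while (index < sorted.size() && sorted.get(index).isBefore(candidate)) {
                index++;
            }
            boolean hasData = index < sorted.size() && sorted.get(index).equals(candidate);
            if (!hasData) {
                return; // candidate is a missing day
            }
            candidate = candidate.plusDays(1);
        }
    }

    @Override
    public boolean hasNext() {
        return !candidate.isAfter(end);
    }

    @Override
    public LocalDate next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        LocalDate result = candidate;
        candidate = candidate.plusDays(1);
        advance();
        return result;
    }
}
```

Because the iterator is lazy, the caller can stop early without the whole range being scanned.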

Nerdfest
+2  A: 

You could create a temp list of every date in the range and cross dates off it as needed.

(Not actual Java. Sorry, my memory of it is horrible.)

dates = [...]; // list you have now;

// build list
unused = [];
for (Date i = startdate; i < enddate; i += day) {
    unused.push(i);
}

// remove used dates
for (int j = 0; j < dates.length; j += 1) {
    if (unused.indexOf((Date) dates[j]) > -1) { // time = 00:00:00
        unused.remove(unused.indexOf((Date) dates[j]));
    }
}
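
For comparison, here is a rough sketch of the same build-then-remove idea in actual Java. It assumes java.time.LocalDate values (so there is no time-of-day component to worry about) and treats the end date as inclusive, unlike the loop above. The class and method names are made up for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RemoveUsedDates {
    /** Returns the days in [start, end] (end inclusive, an assumption) that are not in dates. */
    static List<LocalDate> unusedDates(Collection<LocalDate> dates, LocalDate start, LocalDate end) {
        // Build the full range; a LinkedHashSet keeps insertion order
        // and avoids the linear indexOf() scan for every removal.
        Set<LocalDate> unused = new LinkedHashSet<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            unused.add(d);
        }
        // Cross off every date that has data; dates outside the range are ignored.
        for (LocalDate used : dates) {
            unused.remove(used);
        }
        return new ArrayList<>(unused);
    }
}
```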
Jonathan Lonowski
A: 

You can either create a list of all possible dates between start and end date and then remove dates which appear in the list of given data (works best when most dates are missing), or you can start with an empty list of dates and add ones that don't appear in the given data.

Either way, you basically iterate over the range of dates between the start date and end date, keeping track of where you are in the list of given dates. You could think of it as a 'merge-like' operation where you step through two lists in parallel, processing the records that appear in one list but not in the other. In pseudo-code, the empty list version might be:

# given   - array of given dates
# N       - number of dates in given array
# missing - array of dates missing

i = 0;    # Index into given date array
j = 0;    # Index into missing data array
for (current_date = start_date; current_date <= end_date; current_date++)
{
    while (i < N && given[i] < current_date)    # check the bound before indexing
        i++
    if (i >= N)
        break
    if (given[i] != current_date)
        missing[j++] = current_date
}
while (current_date <= end_date)    # end_date is inclusive, as in the for loop
{
    missing[j++] = current_date
    current_date++
}

I'm assuming that the date type is quantized in units of a day; that is, date + 1 (or date++) is the day after date.
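
A possible Java rendering of the merge-like walk, as a sketch only: it assumes java.time.LocalDate (which is quantized in days), a given list sorted in ascending order, and inclusive start and end dates. The class and method names are hypothetical. The trailing loop from the pseudo-code is folded into the main loop by treating an exhausted list as "no data".

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class MergeMissingDates {
    /** given must be sorted ascending; start and end are both inclusive. */
    static List<LocalDate> missing(List<LocalDate> given, LocalDate start, LocalDate end) {
        List<LocalDate> missing = new ArrayList<>();
        int i = 0; // index into the given dates
        for (LocalDate current = start; !current.isAfter(end); current = current.plusDays(1)) {
            // Skip given dates that fall before the current day.
            while (i < given.size() && given.get(i).isBefore(current)) {
                i++;
            }
            // Once the given dates are exhausted, every remaining day is missing.
            if (i >= given.size() || !given.get(i).equals(current)) {
                missing.add(current);
            }
        }
        return missing;
    }
}
```

Each list is traversed once, so the cost is proportional to the number of days in the range plus the number of given dates.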

Jonathan Leffler
A: 

While the other answers already given look rather simple and enjoyable and hold some good ideas (I especially agree with the Iterator suggestion by Nerdfest), I thought I'd give this a shot anyway and code a solution, just to show how I'd do it for a first iteration. I'm sure there's room for improvement in what's below.

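I also maybe took your requirements a bit too literally, but you know how to adjust the code to your liking. Oh, and sorry for the horrible naming of objects. Also, since this sample uses Calendar: Calendar.roll() changes only the field it is given and never carries into larger fields, so rolling DAY_OF_MONTH past the end of a month wraps back to the 1st of the same month. The loop therefore steps with Calendar.add() instead.

import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;

protected List<Calendar> getDatesWithNoData(Calendar start, Calendar end,
  Calendar[] existingDates) {

 List<Calendar> missingData = new ArrayList<Calendar>();

 // Work on a copy so the caller's start date is not modified.
 Calendar c = (Calendar) start.clone();

 // add() carries into the month and year; roll() would wrap within the month.
 for( ; c.compareTo(end)<=0 ; c.add(Calendar.DAY_OF_MONTH, 1) ) {

  if(!isInDataSet(c, existingDates)) {
   // Store a snapshot, because c keeps changing as the loop advances.
   missingData.add((Calendar) c.clone());
  }
 }
 return missingData;
}

// Linear search. Calendar.equals() compares the exact instant and the calendar
// settings, so all dates must share the same time of day and time zone.
protected boolean isInDataSet(Calendar toSearch, Calendar[] dataSet) {
 for(Calendar l : dataSet) {
  if(toSearch.equals(l)) return true;
 }
 return false;
}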
P Arrayah
A: 

Start with this: what's a date? Is it GMT or local?

If it's GMT, each day number is the java.util.Date.getTime() value divided by 86400000 (the number of milliseconds in a day). You can quickly run through your array, and add the resulting Long values to a TreeSet (which is sorted). Then iterate the TreeSet to find gaps.

If a date is local time, you'll have to add/subtract an appropriate offset before dividing.
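
A rough sketch of the GMT case, under the assumption that the selected range is also given as day numbers since the epoch and is inclusive at both ends. The class and method names are invented for illustration, and the local-time offset adjustment is left out.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.TreeSet;

final class DayNumberGaps {
    static final long MILLIS_PER_DAY = 86_400_000L;

    /** Day number since the epoch in GMT; floorDiv keeps pre-1970 dates consistent. */
    static long dayNumber(Date date) {
        return Math.floorDiv(date.getTime(), MILLIS_PER_DAY);
    }

    /** Returns the day numbers in [startDay, endDay] that have no entry in dates. */
    static List<Long> missingDays(Date[] dates, long startDay, long endDay) {
        List<Long> missing = new ArrayList<>();
        if (startDay > endDay) {
            return missing; // empty range; subSet() would reject it
        }
        TreeSet<Long> present = new TreeSet<>();
        for (Date d : dates) {
            present.add(dayNumber(d));
        }
        // Walk the sorted day numbers inside the range and fill in each gap.
        long expected = startDay;
        for (long day : present.subSet(startDay, true, endDay, true)) {
            for (; expected < day; expected++) {
                missing.add(expected);
            }
            expected = day + 1;
        }
        // Days after the last entry in the range are missing as well.
        for (; expected <= endDay; expected++) {
            missing.add(expected);
        }
        return missing;
    }
}
```

A day number can be turned back into a Date at midnight GMT with new Date(day * MILLIS_PER_DAY).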

kdgregory