views:

191

answers:

3

I need to quickly save a re-ordered sequence back to my items' integer sortOrder columns.

The simple renumber-by-one approach can be slow - if last item moved to first, all N rows are modified. A multi-row update statement would let database do the work, but I'd like to explore smarter ways, like making sortOrder floating point except I don't have that option :(

The solution I imagine would take a renumbered list like this: (100,200,1700,300,400...1600,1800,...) and produce (100,200,250,300,400...1600,1800,...) (by modifying one row in this example). It seems simple at first, but I'm having a hard time expressing it in code...

Can someone help me with this logic? There could be sequences of adjacent items that need to be shifted for a new one to fit - I was hoping someone might have this already written? It has to be faster than what I have now, but still readable and simple to understand/maintain.

OK, after answer, posting back with resulting code I came up with (comments welcome):

/**
 * Renumber list with minimal changes
 * 
 * Given a list of SortOrderNumbers in the 'new' sequence they are to be saved in, determine the
 * minimal set of changes (described by Change(i, newSortOrderNumber)) that can be saved that
 * results in a properly ordered sequence.
 * 
 * A simple answer would always be List(change<1,1>, change<2,2>, ...change<n,n>) which is of length N.
 * This method returns a set of changes no larger than N (often much smaller for simple moves).
 * 
 * @author Jim Pinkham
 * @param s
 * @return Set<Change>
 */
private Set<Change> renumber(int[] s) {
    Set<Change> ret = new HashSet<Change>();
    // pass1 goes from start forwards looking for decrease in numbering
    for (int i=1; i<s.length; i++) {
     // if predecessor of item is larger and it's large enough to renumber from start of list 
     if (s[i-1]>s[i] && s[i]>i) {       
      int beforeStart=0;
      int beforeIndex=-1;
      // go back towards start looking for anchor
      for (int j=i-2; j>=0; --j) {
       int diff = s[i]-s[j];
       if (diff>(i-j)) {
        beforeIndex=j;
        beforeStart=s[beforeIndex];
        break;
       }
      }
  int diff = s[i]-beforeStart;
  int stepsToDiff=i-beforeIndex;
      int delta = diff/stepsToDiff;
      // renumber from start of list or anchor thru decrease
      int fixCnt=0;
      for (int j=beforeIndex+1; j<i; ++j) {
       s[j] = beforeStart + (delta*++fixCnt); 
       System.out.println("s["+j+"]="+s[j]);
       ret.add(new Change(j, s[j]));
      }
     }
    }
    // pass1 could leave some decreases in sequence  
    // pass2 goes from end back to start
    for (int i=s.length-1; i>0; i--) {
     // if predecessor of item is larger 
     if (s[i-1] > s[i]) {
      int afterIndex=s.length;
      int delta=DEFAULT_RENUMBER_GAP;
      // go back towards end looking for anchor       
      for (int j=i; j<s.length; ++j) {
       int diff = s[j]-s[i-1];
       if (diff>(j-(i-1))) {
        afterIndex=j;
        int afterEnd=s[afterIndex];
        int stepsToDiff=afterIndex-(i-1);
        int gap = afterEnd-s[i-1];
           delta = gap/stepsToDiff;
        break;
       }
      }
      // renumber from decrease thru anchor or end of list
      int fixCnt=0;
      for (int j=i; j<afterIndex; ++j) {
       s[j] = s[i-1] + (delta*++fixCnt); 
       System.out.println("s["+j+"]="+s[j]);
       ret.add(new Change(j, s[j]));        
      }
     }
    }
    return ret;
}
class Change { 
    int i; 
    int sortOrder; 
    Change(int i, int sortOrder) { 
     this.i=i; this.sortOrder=sortOrder; 
    }
    public boolean equals(Change other) {
     return Integer.valueOf(i).equals(Integer.valueOf(other.i));
    }
    public int hashCode() {
     return Integer.valueOf(i).hashCode();
    }
}
A: 

An additional level of indirection may help, e.g. implement a relocatable handle to each row in place of a row index. So instead of dealing with row[x], deal with row[handle[x]]

edit: ok so this is not possible in your situation...can you clarify then how much reordering you expect?

I gather from the phrasing of the question that you expect only M of N items to move, where M is significantly less than N. So you want to avoid N updates - you'd rather have something like M updates.

If M is less than N/2 then it should be faster to define the reordering in terms of swap operations. You don't say what your UI is, but the user is probably effectively doing swap operations there anyhow. So by recording those, or using a standard sort algorithm to get from the original state to the desired state, you should be able to generate the set of M swap operations needed to reorder the elements. That should only require M*2 row updates - i.e. if only two items trade places you need update only 2 rows.

There may be some degenerate cases where this is actually slower than just rewriting everything though - seems unlikely though if as implied by the question it is just the user reordering stuff.

frankodwyer
In other words, use an external index - might help someone else, but that's not really possible in my situation. My main problem is, I can only achieve about 5 rec/sec update speed so save is taking 6+ minutes in some cases. Hoping for a reference to some renumbering algorithm.
Jim P
updated my answer - see if the new version helps
frankodwyer
I appreciate the idea, but I don't think "if only two items trade places you need update only 2 rows" applies unless I were using a linked list.
Jim P
actually it is true independent of the datastructure used- however working it on paper I see it still suffers from degenerate cases such as the one you mention, e.g. moving last to first will require enough swaps to touch every row, so it won't work for what you want.I think I understand what you are trying to do with the gaps now - I have a different idea that might work for you. Will post that as a different answer.
frankodwyer
+1  A: 

Following your example you could do something like this:

Walk your numbers array. If the successor of an item x is smaller than x itself walk the array backwards until you find the item y with the minimum difference between y and x+1. Count the steps you walked backwards, take the minimum distance, walk forewards from y and set the items to y+((x+1)-y)/count.

DaClown
Thanks, that's the kind of idea I was looking for.
Jim P
+1  A: 

I'd like to explore smarter ways, like making sortOrder floating point except I don't have that option

If you find it easier to think of it in terms of floating point, why not imagine the number as fixed point.

e.g. for the purposes of your algorithm interpret 1000000 as 100.0000. You'll need to choose the point position so that there as many decimal (or binary) places as you can fit given (max number of items in your array+2) vs the integer size. So let's say the max number of entries is 998, you'd need 3 digits before the point, the rest would be available for 'gaps'.

A move operation then can be as simple as setting its new sortnumber to half the sum of the sortnumber of the items either side, i.e. slotting the moved item between its new neighbors. Use 0 and size(array)+1 as the end cases. Again I'm assuming that your UI can record the moves done by the user - regardless I think it should be fairly straightforward to work them out, and a standard sort algorithm could probably be used, just redefine 'swap'.

So for example moving last to first in this array (with imaginary decimal point):

1.0000 2.0000 3.0000 4.0000 5.0000

becomes

1.0000 2.0000 3.0000 4.0000 0.5000 = (0.0000 + 1.0000)/2

giving a sort order of

0.5000 1.0000 2.0000 3.0000 4.0000

Which changes just one record, the last one in the array

Moving last to second would do this:

1.0000 2.0000 3.0000 4.0000 5.0000

Becomes

1.0000 2.0000 3.0000 4.0000 1.5000 = (1.0000+2.0000)/2

resulting in a sort order of

1.0000 1.5000 2.0000 3.0000 4.0000

Again, just one record changed.

You will still need to cater for the case where you you run out of room 'between' two numbers, which you will eventually. I think this is true regardless of algorithm. This will require 'swap' to renumber more entries to make more room. Again regardless of algorithm I don't think you can rule out the case where everything has to be renumbered, it will just be very unlikely. I also suspect that extensive renumbers become more likely over time, again no matter what you do - the available space will fragment. However by choosing the position of the point to give as much room as possible, it should be optimal, i.e. you postpone that as long as possible.

To avoid having to do a more extensive renumber at an inconvenient time, it would probably be advisable to regularly do some kind of batch renumber during quiet periods - basically stretching the gaps again to make room for further user driven sorts. Again, I think you probably need this no matter what method you use.

This is just a sketch and I think it is probably equivalent to any other way of doing it, though perhaps a more intuitive/maintainable way of thinking about it and a way of maximising the room for expansion. Also if you're really worried about poor performance of degenerate cases - and from your description it sounds like you should be - I'd suggest to run whatever algorithm you go with in a test harness with a lot of random data (no database) over a long period, to see how many renumbers it really performs in practice and especially to see if it degrades with use over a long period. I suspect any algorithm for this will.

frankodwyer
Thanks for taking the time to make this suggestion. I had considered almost exactly this idea - as you mention eventually you need the renumbering. I think I came up with a hybrid approach by using a large RENUMBER_GAP I'll end up with the 'float approximiation' in most cases, but it also handles the renumber when it needs to.
Jim P